DIME: Diffusion-Based Maximum Entropy Reinforcement Learning

Celik, Onur; Li, Zechu; Blessing, Denis; Li, Ge; Palenicek, Daniel; Peters, Jan; Chalvatzaki, Georgia; Neumann, Gerhard

doi:10.48550/arXiv.2502.02316

Celik, Onur; Li, Zechu; Blessing, Denis; Li, Ge ¹; Palenicek, Daniel; Peters, Jan; Chalvatzaki, Georgia; Neumann, Gerhard ¹
¹ Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract:

Maximum entropy reinforcement learning (MaxEnt-RL) has become the standard approach to RL due to its beneficial exploration properties. Traditionally, policies are parameterized using Gaussian distributions, which significantly limits their representational capacity. Diffusion-based policies offer a more expressive alternative, yet integrating them into MaxEnt-RL poses challenges, primarily due to the intractability of computing their marginal entropy. To overcome this, we propose Diffusion-Based Maximum Entropy RL (DIME). DIME leverages recent advances in approximate inference with diffusion models to derive a lower bound on the maximum entropy objective. Additionally, we propose a policy iteration scheme that provably converges to the optimal diffusion policy. Our method enables the use of expressive diffusion-based policies while retaining the principled exploration benefits of MaxEnt-RL, significantly outperforming other diffusion-based methods on challenging high-dimensional control benchmarks. It is also competitive with state-of-the-art non-diffusion-based RL methods while requiring fewer algorithmic design choices and smaller update-to-data ratios, reducing computational complexity.
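
For orientation, the objective the abstract refers to can be written out. This is only a sketch in standard MaxEnt-RL notation, not the paper's exact formulation: the temperature $\alpha$, the discount $\gamma$, the denoising chain $a^{0:N}$, and the auxiliary distribution $q$ are symbols assumed here for illustration.

```latex
% Standard maximum entropy RL objective: expected reward plus policy entropy.
J(\pi) = \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}
  \Big( r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right]

% For a diffusion policy, the action a^0 is the end point of a denoising chain
% a^N, ..., a^0, so the marginal entropy contains an integral over a^{1:N}
% that is intractable in general.
\mathcal{H}\big(\pi(a^{0} \mid s)\big)
  = -\mathbb{E}_{\pi(a^{0} \mid s)}\left[ \log \int \pi(a^{0:N} \mid s) \, \mathrm{d}a^{1:N} \right]

% Generic variational lower bound, valid for any auxiliary q(a^{1:N} | a^0, s)
% (q is an assumption of this sketch).
\mathcal{H}\big(\pi(a^{0} \mid s)\big)
  \ge -\mathbb{E}_{\pi(a^{0:N} \mid s)}\left[
      \log \frac{\pi(a^{0:N} \mid s)}{q(a^{1:N} \mid a^{0}, s)} \right]
```

Substituting a bound of this kind for the entropy term yields a tractable lower bound on $J(\pi)$. The gap is the expected KL divergence between the policy's posterior over the intermediate steps, $\pi(a^{1:N} \mid a^{0}, s)$, and $q$, so the bound is tight when the two coincide.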

KITopen full text
DOI: 10.5445/IR/1000189671
Published on 15.01.2026

Original publication
DOI: 10.48550/arXiv.2502.02316


Affiliated institution(s) at KIT	Institut für Anthropomatik und Robotik (IAR)
Publication type	Research report/Preprint
Publication date	10.06.2025
Language	English
Identifier	KITopen-ID: 1000189671
Publisher	arXiv
Extent	20 pp.
Keywords	Machine Learning (cs.LG)
Indexed in	Dimensions, OpenAlex, arXiv
Relations in KITopen	Refers to: DIME: Diffusion-Based Maximum Entropy Reinforcement Learning. Celik, Onur; Li, Zechu; Blessing, Denis; Li, Ge; Palenicek, Daniel; Peters, Jan; Chalvatzaki, Georgia; Neumann, Gerhard (2025) Research report/Preprint (1000189671)
