
DIME: Diffusion-Based Maximum Entropy Reinforcement Learning

Celik, Onur; Li, Zechu; Blessing, Denis; Li, Ge 1; Palenicek, Daniel; Peters, Jan; Chalvatzaki, Georgia; Neumann, Gerhard 1
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract (English):

Maximum entropy reinforcement learning (MaxEnt-RL) has become the standard approach to RL due to its beneficial exploration properties. Traditionally, policies are parameterized using Gaussian distributions, which significantly limits their representational capacity. Diffusion-based policies offer a more expressive alternative, yet integrating them into MaxEnt-RL poses challenges, primarily due to the intractability of computing their marginal entropy. To overcome this, we propose Diffusion-Based Maximum Entropy RL (DIME). DIME leverages recent advances in approximate inference with diffusion models to derive a lower bound on the maximum entropy objective. Additionally, we propose a policy iteration scheme that provably converges to the optimal diffusion policy. Our method enables the use of expressive diffusion-based policies while retaining the principled exploration benefits of MaxEnt-RL, significantly outperforming other diffusion-based methods on challenging high-dimensional control benchmarks. It is also competitive with state-of-the-art non-diffusion-based RL methods while requiring fewer algorithmic design choices and smaller update-to-data ratios, reducing computational complexity.


Publisher's version
DOI: 10.5445/IR/1000189666
Published on: 15.01.2026
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication date: 01.05.2025
Language: English
Identifier: KITopen-ID: 1000189666
Published in: Forty-second International Conference on Machine Learning; Vancouver, Canada, 13.-19.07.2025.
Event: 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada, 13.07.2025 – 19.07.2025
Publisher: ICML
Pages: 20 pp.
Keywords: Reinforcement Learning, Diffusion Models, Diffusion Based Reinforcement Learning, Maximum Entropy Reinforcement Learning