Learning Reward Models for Cooperative Trajectory Planning with Inverse Reinforcement Learning and Monte Carlo Tree Search

Kurzer, Karl; Bitzer, Matthias; Zöllner, J. Marius

doi:10.48550/arXiv.2202.06443

Kurzer, Karl ¹; Bitzer, Matthias; Zöllner, J. Marius ¹
¹ Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB), Karlsruher Institut für Technologie (KIT)

Abstract:

Cooperative trajectory planning methods for automated vehicles can solve traffic scenarios that require a high degree of cooperation between traffic participants. However, for cooperative systems to integrate into human-centered traffic, the automated systems must behave human-like so that humans can anticipate the system's decisions. While Reinforcement Learning has made remarkable progress in solving the decision-making part, it is non-trivial to parameterize a reward model that yields predictable actions. This work employs feature-based Maximum Entropy Inverse Reinforcement Learning combined with Monte Carlo Tree Search to learn reward models that maximize the likelihood of recorded multi-agent cooperative expert trajectories. The evaluation demonstrates that the approach can recover a reasonable reward model that mimics the expert and performs similarly to a manually tuned baseline reward model.
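
To illustrate the feature-based Maximum Entropy Inverse Reinforcement Learning idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a linear reward over trajectory features and a fixed set of sampled trajectories standing in for the rollouts that Monte Carlo Tree Search would produce. The function names, the learning rate, and the toy feature vectors are hypothetical and not taken from the paper.

```python
import math


def softmax_weights(theta, sampled_features):
    """Boltzmann weights p_i proportional to exp(theta . f_i) over sampled trajectories."""
    scores = [sum(t * f for t, f in zip(theta, feats)) for feats in sampled_features]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def maxent_gradient(theta, expert_features, sampled_features):
    """Gradient of the expert log-likelihood: expert features minus expected features."""
    weights = softmax_weights(theta, sampled_features)
    expected = [
        sum(w * feats[k] for w, feats in zip(weights, sampled_features))
        for k in range(len(theta))
    ]
    return [fe - ex for fe, ex in zip(expert_features, expected)]


def fit_reward(expert_features, sampled_features, lr=0.1, steps=200):
    """Plain gradient ascent on the reward weights theta (hypothetical settings)."""
    theta = [0.0] * len(expert_features)
    for _ in range(steps):
        grad = maxent_gradient(theta, expert_features, sampled_features)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    # Toy data (assumption): two sampled trajectories with two features each,
    # and expert feature averages that favour the first feature.
    samples = [[1.0, 0.0], [0.0, 1.0]]
    expert = [0.8, 0.2]
    theta = fit_reward(expert, samples, steps=500)
    print("learned weights:", theta)
    print("remaining gradient:", maxent_gradient(theta, expert, samples))
```

In a fuller setup the sampled trajectories would typically be regenerated by the planner under the current reward weights at each iteration; keeping them fixed here is a simplification to keep the example short.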

Affiliated institution(s) at KIT	Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
Publication type	Research report/Preprint
Publication date	6 May 2022
Language	English
Identifier	KITopen-ID: 1000150006
Indexed in	arXiv, OpenAlex, Dimensions
Relations in KITopen	Refers to: Learning Reward Models for Cooperative Trajectory Planning with Inverse Reinforcement Learning and Monte Carlo Tree Search. Kurzer, Karl; Bitzer, Matthias; Zöllner, J. Marius (2022) Proceedings paper (1000149992)