Versatile Inverse Reinforcement Learning via Cumulative Rewards

Freymuth, Niklas; Becker, Philipp; Neumann, Gerhard

Abstract:

Inverse Reinforcement Learning infers a reward function from expert demonstrations, aiming to encode the behavior and intentions of the expert. Current approaches usually do this with generative and uni-modal models, meaning that they encode a single behavior. In the common setting where a problem has several solutions and the experts show versatile behavior, this severely limits the generalization capabilities of these methods. We propose a novel method for Inverse Reinforcement Learning that overcomes these problems by formulating the recovered reward as a sum of iteratively trained discriminators. We show on simulated tasks that our approach recovers general, high-quality reward functions and produces policies of the same quality as behavioral cloning approaches designed for versatile behavior.
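
The phrase "a sum of iteratively trained discriminators" can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the discriminators are trained or what they output, so the code assumes each discriminator returns a scalar logit, and `make_toy_discriminator` is a hypothetical stand-in for a trained model.

```python
from typing import Callable, List, Sequence

# Assumption: a discriminator maps a sample to a scalar logit.
Discriminator = Callable[[Sequence[float]], float]


def cumulative_reward(discriminators: List[Discriminator], x: Sequence[float]) -> float:
    """Return the sum of the logits of all discriminators collected so far."""
    return sum(d(x) for d in discriminators)


def make_toy_discriminator(center: float, scale: float) -> Discriminator:
    """Hypothetical stand-in for a trained discriminator (not from the paper).

    Its logit is highest at `center` and falls off quadratically.
    """
    def logit(x: Sequence[float]) -> float:
        return -scale * (x[0] - center) ** 2
    return logit


# One discriminator would be appended per training iteration.
discriminators = [
    make_toy_discriminator(center=1.0, scale=1.0),
    make_toy_discriminator(center=1.0, scale=0.5),
]

reward_near = cumulative_reward(discriminators, [1.0])  # sample at the toy "expert" mode
reward_far = cumulative_reward(discriminators, [3.0])   # sample far from it
```

In this toy setup the sample at the mode receives a higher cumulative reward than the distant one. How each new discriminator is actually fitted is left to the paper.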

Affiliated institution(s) at KIT	Institute for Anthropomatics and Robotics (IAR)
Publication type	Proceedings contribution
Publication date	14.12.2021
Language	English
Identifier	KITopen-ID: 1000140287
Published in	NeurIPS 2021 Workshop on Robot Learning: Self-Supervised and Lifelong Learning, Virtual
Event	35th Annual Conference on Neural Information Processing Systems (NIPS 2021), Online, 06.12.2021 – 14.12.2021
Publication note	Workshop on Robot Learning: Self-Supervised and Lifelong Learning. 6.12.2021
Indexed in	arXiv

KITopen-Download

Postprint

DOI: 10.5445/IR/1000140287

Published on 10.12.2021

