Information Maximizing Curriculum: A Curriculum-Based Approach for Learning Versatile Skills

Blessing, Denis 1; Celik, Onur 1; Jia, Xiaogang 1; Reuss, Moritz 1; Li, Maximilian 1; Lioutikov, Rudolf 1; Neumann, Gerhard 1
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract:

Imitation learning uses data to train policies that solve complex tasks. However, when the training data is collected from human demonstrators, it often follows a multimodal distribution because of the variability in human actions. Most imitation learning methods rely on a maximum likelihood (ML) objective to learn a parameterized policy, but this can result in suboptimal or unsafe behavior due to the mode-averaging property of the ML objective. In this work, we propose Information Maximizing Curriculum, a curriculum-based approach that assigns a weight to each data point and encourages the model to specialize in the data it can represent, effectively mitigating the mode-averaging problem by allowing the model to ignore data from modes it cannot represent. To cover all modes and thus enable diverse behavior, we extend our approach to a mixture of experts (MoE) policy, where each mixture component selects its own subset of the training data for learning. A novel, maximum entropy-based objective is proposed to achieve full coverage of the dataset, thereby enabling the policy to encompass all modes within the data distribution. We demonstrate the effectiveness of our approach on complex simulated control tasks using diverse human demonstrations, achieving superior performance compared to state-of-the-art methods.
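
The mode-averaging effect and the idea of per-sample weights can be illustrated on a toy problem. The following is a minimal sketch and not the authors' implementation: it assumes one-dimensional actions drawn from two modes, a Gaussian policy with a fixed standard deviation (`SIGMA`), and weights given by a softmax over per-sample log-likelihoods with a temperature `eta`. All names, values, and the specific weighting rule are assumptions made for illustration.

```python
import numpy as np

SIGMA = 0.5  # fixed policy standard deviation (assumed for this sketch)


def weighted_ml_mean(actions, weights):
    """Weighted maximum-likelihood estimate of a Gaussian mean."""
    return float(np.sum(weights * actions) / np.sum(weights))


def curriculum_weights(actions, mean, eta=1.0):
    """Softmax over per-sample log-likelihoods, scaled by temperature eta.

    The constant terms of the Gaussian log-density are dropped because
    they cancel in the normalization.
    """
    logits = -0.5 * ((actions - mean) / SIGMA) ** 2 / eta
    w = np.exp(logits - logits.max())  # subtract max for numerical stability
    return w / w.sum()


# Bimodal "demonstrations": half of the actions near -1, half near +1.
rng = np.random.default_rng(0)
actions = np.concatenate([
    rng.normal(-1.0, 0.05, 200),
    rng.normal(+1.0, 0.05, 200),
])

# Plain ML with uniform weights averages the two modes.
ml_mean = weighted_ml_mean(actions, np.ones_like(actions))

# Alternate between re-weighting the data and refitting the mean.
cur_mean = 0.5  # hypothetical initial guess, slightly closer to the +1 mode
for _ in range(20):
    weights = curriculum_weights(actions, cur_mean)
    cur_mean = weighted_ml_mean(actions, weights)

print(f"uniform-weight ML mean: {ml_mean:.3f}")
print(f"curriculum-weighted mean: {cur_mean:.3f}")
print(f"weight mass on positive mode: {weights[actions > 0].sum():.3f}")
```

In this sketch the uniform-weight fit lands near 0, an action that no demonstrator took, while the weighted fit settles near the +1 mode and places almost no weight on the other one. Covering both modes would require several such components, which is the role the abstract assigns to the mixture of experts policy.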


Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2024
Language: English
Identifier: ISSN 1049-5258
KITopen-ID: 1000168977
Published in: Advances in Neural Information Processing Systems. Ed.: A. Oh
Event: 37th Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, LA, USA, 10.12.2023 – 16.12.2023
Publisher: MIT-Press
Series: NeurIPS Proceedings ; 36
Publication note: in press
External relations: Abstract/Full text
Keywords: Imitation learning, Versatile skill learning, Mixture of Experts