Goal-Conditioned Imitation Learning using Score-based Diffusion Policies

Reuss, Moritz; Li, Maximilian; Jia, Xiaogang; Lioutikov, Rudolf

doi:10.48550/arXiv.2304.02532

Reuss, Moritz ¹; Li, Maximilian ¹; Jia, Xiaogang ¹; Lioutikov, Rudolf ¹
¹ Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract:

We propose a new policy representation based on score-based diffusion models (SDMs). We apply our new policy representation in the domain of Goal-Conditioned Imitation Learning (GCIL) to learn general-purpose goal-specified policies from large uncurated datasets without rewards. Our new goal-conditioned policy architecture "BEhavior generation with ScOre-based Diffusion Policies" (BESO) leverages a generative, score-based diffusion model as its policy. BESO decouples the learning of the score model from the inference sampling process and hence allows for fast sampling strategies that generate goal-specified behavior in just 3 denoising steps, compared to the 30+ steps of other diffusion-based policies. Furthermore, BESO is highly expressive and can effectively capture the multi-modality present in the solution space of the play data. Unlike previous methods such as Latent Plans or C-Bet, BESO does not rely on complex hierarchical policies or additional clustering for effective goal-conditioned behavior learning. Finally, we show how BESO can even be used to learn a goal-independent policy from play data using classifier-free guidance. ...

Affiliated institution(s) at KIT	Institut für Anthropomatik und Robotik (IAR)
Publication type	Research report/Preprint
Publication year	2023
Language	English
Identifier	KITopen-ID: 1000158726
Publisher	arXiv
Extent	14 pp.
Keywords	Machine Learning (cs.LG); Robotics (cs.RO)
Indexed in	Dimensions; arXiv; OpenAlex

KITopen full text

DOI: 10.5445/IR/1000158726

Published on 12 May 2023

Original publication
DOI: 10.48550/arXiv.2304.02532

