
LLM-Based Multimodal Prompting for Adaptive Robot Control in Production Systems

Koch, Dominik 1; Wolber, Jakob 1; Shi, Zhuo; Cheng Ji, Bo; Bretz, Lucas; Geiser, Alexander 1; Baer, Felix; Benfer, Martin 1; Stamer, Florian; Lanza, Gisela 1; Technische Informationsbibliothek (TIB); Herberger, David [Ed.]; Hübner, Marco [Ed.]
1 Institut für Produktionstechnik (WBK), Karlsruher Institut für Technologie (KIT)

Abstract:

Large Language Models (LLMs) open new opportunities for adaptive automation in production systems by enabling robots to interpret human instructions and generate context-aware actions. In contrast to conventional robot programming, which requires expert knowledge and frequent reconfiguration, LLM-based control promises greater flexibility and easier interaction between humans and machines. However, generic LLMs still face major challenges when applied to manufacturing environments, as they lack grounding in real-world perception and may produce infeasible or unsafe actions. This paper presents a laboratory demonstrator that evaluates how different prompting strategies affect the performance of an LLM-controlled pick-and-place robot. The study systematically compares zero-shot and multimodal few-shot prompting, where visual examples such as annotated video frames and image captions are integrated into the LLM input. A dedicated evaluation model with metrics for plan success, action success, and plan optimality is used to quantify system behavior. The experimental results demonstrate that multimodal few-shot prompting significantly improves planning accuracy, robustness, and adaptability compared to a zero-shot baseline. ...


Publisher's version
DOI: 10.5445/IR/1000193181
Published on 13 May 2026
Affiliated institution(s) at KIT: Institut für Produktionstechnik (WBK)
Publication type: Proceedings paper
Publication year: 2026
Language: English
Identifier: ISSN 2701-6277
KITopen-ID: 1000193181
Published in: Proceedings of the Conference on Production Systems and Logistics: CPSL 2026
Event: 8th Conference on Production Systems and Logistics (CPSL 2026), Porto, Portugal, 14–17 April 2026
Publisher: Offenburg: publish-Ing.
Advance online publication: 16 April 2026
Keywords: 600 | Technology, Large Language Models (LLMs), Adaptive Production Systems, Multimodal Few-Shot Prompting, Intelligent Robot Control, Human-Interpretable Automation
Indexed in: OpenAlex