
What if? Emulative Simulation with World Models for Situated Reasoning

Liu, Ruiping 1; Chen, Yufan; Zhang, Yuheng; Zheng, Junwei 1; Peng, Kunyu 1; Wu, Chengzhi; Huang, Chenguang; Wen, Di 1; Zhang, Jiaming; Yang, Kailun 1; Stiefelhagen, Rainer 1
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract:

Situated reasoning often relies on active exploration, yet in many real-world scenarios such exploration is infeasible due to physical constraints of robots or safety concerns of visually impaired users. Given only a limited observation, can an agent mentally simulate a future trajectory toward a target situation and answer spatial what-if questions? We introduce WanderDream, the first large-scale dataset designed for the emulative simulation of mental exploration, enabling models to reason without active exploration. WanderDream-Gen comprises 15.8K panoramic videos across 1,088 real scenes from HM3D, ScanNet++, and real-world captures, depicting imagined trajectories from current viewpoints to target situations. WanderDream-QA contains 158K question-answer pairs, covering starting states, paths, and end states along each trajectory to comprehensively evaluate exploration-based reasoning. Extensive experiments with world models and MLLMs demonstrate (1) that mental exploration is essential for situated reasoning, (2) that world models achieve compelling performance on WanderDream-Gen, (3) that imagination substantially facilitates reasoning on WanderDream-QA, and (4) that WanderDream data exhibit remarkable transferability to real-world scenarios.


Full text
DOI: 10.5445/IR/1000192022
Published on 8 April 2026
Original publication
DOI: 10.48550/arXiv.2603.06445
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Research report/Preprint
Publication date: 6 March 2026
Language: English
Identifier: KITopen-ID: 1000192022
Publisher: arXiv
Series: Computer Science - Computer Vision and Pattern Recognition
External relations: See also
Indexed in: arXiv, OpenAlex