
SpatialLogic-Bench: A Diagnostic Benchmark for Task-Oriented Spatiotemporal Reasoning

Yang, Xiaoda; Gao, Shenzhou; Wang, Can; Zhang, Jiahe; Tang, Menglan; Xue, Jingyang; Liu, Sheng 1; Zhang, Peijian; Mu, Yao; Yue, Xiangyu
1 Institut für Fördertechnik und Logistiksysteme (IFL), Karlsruher Institut für Technologie (KIT)

Abstract:

Vision-Language Models (VLMs) have made significant progress in static perception, but their ability to understand dynamic task-oriented reasoning remains unclear. Existing benchmarks mainly focus on static spatial relationships and lack systematic assessment of dynamic reasoning capabilities. To this end, we propose SpatialLogic-Bench, a novel benchmark designed to evaluate VLMs’ understanding of spatiotemporal logic and their ability to assess task progress. The benchmark assesses two critical capabilities: first, fine-grained visual discrimination to accurately perceive subtle physical changes between state frames; second, the logical capacity to connect these changes to task goals and judge whether they indicate progress. To mitigate temporal dependency biases, we introduce a dual-task paradigm, presenting image pairs in both chronological and reversed orders while keeping task descriptions consistent. We construct a multi-scale evaluation system by varying time intervals between frames: smaller intervals test the model's fine-grained perception, while larger intervals demand more sophisticated logical inference. Empirical evaluation reveals that most VLMs experience significant performance degradation on tasks presented in inverse chronological order, indicating an over-reliance on temporal cues rather than robust reasoning abilities. ...
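
The dual-task, multi-scale setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `PairItem`, `build_pairs`, and `accuracy_by_order`, the labels "progress" and "no_progress", and the assumption that later frames in a trajectory are always closer to the task goal are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class PairItem:
    first: str      # frame shown first
    second: str     # frame shown second
    task: str       # task description, identical for both orderings
    interval: int   # gap between the two frames, counted in frames
    order: str      # "chronological" or "reversed"
    label: str      # hypothetical ground truth: "progress" or "no_progress"


def build_pairs(frames: Sequence[str], task: str,
                intervals: Sequence[int]) -> List[PairItem]:
    """Build chronological and reversed pairs at several frame intervals.

    Assumption (not from the source): a later frame is always closer to
    the task goal, so the reversed pair is labelled "no_progress".
    """
    items: List[PairItem] = []
    for k in intervals:
        for i in range(len(frames) - k):
            early, late = frames[i], frames[i + k]
            items.append(PairItem(early, late, task, k, "chronological", "progress"))
            items.append(PairItem(late, early, task, k, "reversed", "no_progress"))
    return items


def accuracy_by_order(items: Sequence[PairItem],
                      predictions: Sequence[str]) -> Dict[str, float]:
    """Report accuracy separately for each presentation order."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for item, pred in zip(items, predictions):
        totals[item.order] = totals.get(item.order, 0) + 1
        hits[item.order] = hits.get(item.order, 0) + int(pred == item.label)
    return {order: hits[order] / totals[order] for order in totals}


# Usage: a toy trajectory of four frames, sampled at intervals 1 and 2.
items = build_pairs(["f0", "f1", "f2", "f3"], "stack the blocks", [1, 2])
# A model that always answers "progress" looks perfect on chronological
# pairs and fails every reversed pair.
scores = accuracy_by_order(items, ["progress"] * len(items))
```

Scoring the two orders separately is what exposes the failure mode the abstract reports: a model leaning on temporal cues can score well on chronological pairs while collapsing on reversed ones.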


Original publication
DOI: 10.1609/aaai.v40i23.39022
Affiliated institution(s) at KIT: Institut für Fördertechnik und Logistiksysteme (IFL)
Publication type: Proceedings paper
Publication year: 2026
Language: English
Identifier: ISSN: 2374-3468, 2159-5399
KITopen-ID: 1000192422
Published in: Proceedings of the AAAI Conference on Artificial Intelligence
Event: 40th AAAI Conference on Artificial Intelligence (2026), Singapore, Singapore, 20 January 2026 – 27 January 2026
Publisher: Association for the Advancement of Artificial Intelligence (AAAI)
Pages: 19441 - 19449
Series: 40
Published online in advance on: 14 March 2026
Indexed in: Scopus, OpenAlex