
Trans4Map: Revisiting Holistic Bird’s-Eye-View Mapping From Egocentric Images to Allocentric Semantics With Vision Transformers

Chen, Chang 1; Zhang, Jiaming 1; Yang, Kailun 1; Peng, Kunyu 1; Stiefelhagen, Rainer 1
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract (English):

Humans have an innate ability to sense their surroundings: they extract a spatial representation from egocentric perception and form an allocentric semantic map via spatial transformation and memory updating. However, endowing mobile agents with such a spatial sensing ability remains a challenge, due to two difficulties: (1) previous convolutional models are limited by their local receptive field and thus struggle to capture holistic long-range dependencies during observation; (2) the excessive computational budgets required for success often lead to a separation of the mapping pipeline into stages, rendering the entire mapping process inefficient. To address these issues, we propose an end-to-end one-stage Transformer-based framework for Mapping, termed Trans4Map. Our egocentric-to-allocentric mapping process includes three steps: (1) the efficient transformer extracts the contextual features from a batch of egocentric images; (2) the proposed Bidirectional Allocentric Memory (BAM) module projects egocentric features into the allocentric memory; (3) the map decoder parses the accumulated memory and predicts the top-down semantic segmentation map. ...
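
The three steps named in the abstract can be illustrated with a toy data-flow sketch. This is not the Trans4Map implementation: the function names (`extract_features`, `project_to_memory`, `decode_map`, `build_map`), the two-channel placeholder features, the precomputed pixel-to-cell indices, and the element-wise maximum used when several features land in the same cell are all assumptions made for illustration. The abstract does not specify how the BAM module performs the projection.

```python
from typing import List, Sequence, Tuple

Feature = List[float]
Cell = Tuple[int, int]


def extract_features(frame: Sequence[float]) -> List[Feature]:
    """Step (1) stand-in: a real system would run a transformer encoder.

    Here each pixel value p in [0, 1] becomes a 2-channel feature [p, 1 - p].
    """
    return [[p, 1.0 - p] for p in frame]


def project_to_memory(
    memory: List[List[Feature]],
    features: Sequence[Feature],
    cells: Sequence[Cell],
) -> None:
    """Step (2) stand-in: write egocentric features into allocentric cells.

    `cells` gives the (row, col) map cell for each feature and is assumed
    to be precomputed. Collisions are merged with an element-wise maximum,
    which is an illustrative choice, not the paper's rule.
    """
    for feat, (r, c) in zip(features, cells):
        memory[r][c] = [max(a, b) for a, b in zip(memory[r][c], feat)]


def decode_map(memory: List[List[Feature]]) -> List[List[int]]:
    """Step (3) stand-in: label each cell with its strongest channel.

    Cells that were never observed stay all-zero and decode to class 0.
    """
    return [
        [max(range(len(cell)), key=cell.__getitem__) for cell in row]
        for row in memory
    ]


def build_map(
    frames: Sequence[Sequence[float]],
    cells_per_frame: Sequence[Sequence[Cell]],
    size: int,
    channels: int = 2,
) -> List[List[int]]:
    """Accumulate all frames into one memory, then decode it once."""
    memory = [[[0.0] * channels for _ in range(size)] for _ in range(size)]
    for frame, cells in zip(frames, cells_per_frame):
        project_to_memory(memory, extract_features(frame), cells)
    return decode_map(memory)
```

For example, `build_map([[0.9, 0.2], [0.1]], [[(0, 0), (0, 1)], [(1, 1)]], size=2)` accumulates two toy frames into a 2x2 memory and decodes it in a single pass, mirroring the one-stage flow described in the abstract.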


Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2023
Language: English
Identifier: KITopen-ID 1000154534
HGF program: 46.24.01 (POF IV, LK 01) Applied TA: Digitalizat. & Automat. Socio-Technical Change
Published in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, January 3-7, 2023
Event: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023), Waikoloa, HI, USA, January 3-7, 2023
Pages: 4013-4022
Indexed in: arXiv