Contextual Semantic Mapping Dataset for Intralogistics: RGB Images, 2D LiDAR Scans, and SAM Instance Annotations

Rüdt, Marvin; Pang, Hao; Enke, Constantin; Furmans, Kai; Seibold, Zäzilia

doi:10.35097/qx7b62vnbercxzj9

Rüdt, Marvin ¹; Pang, Hao ¹; Enke, Constantin ¹; Furmans, Kai ¹; Seibold, Zäzilia ¹
¹ Institut für Fördertechnik und Logistiksysteme (IFL), Karlsruher Institut für Technologie (KIT)

Abstract:

This dataset provides synchronized sensor recordings from a mobile robot in a controlled intralogistics environment, together with instance-level annotations. It comprises 74 frames extracted from an exploration run using motion thresholds (30 cm translation or 15° rotation). Each frame contains an undistorted RGB image (768 × 480 pixels), a fused 2D laser scan with point-to-pixel correspondences, and two instance-segmentation files: automatically generated, fused SAM masks and manually created ground-truth masks. The environment contains 18 object instances from 13 semantic classes. ...
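
The motion-threshold subsampling mentioned in the abstract can be illustrated with a short sketch. It assumes 2D poses given as (x, y, yaw) with x and y in meters and yaw in radians, and it keeps a pose once the robot has moved at least 30 cm or turned at least 15° since the last kept pose. Whether the authors compare against the last kept frame, and how they handle values exactly at the threshold, is not stated here, so the function `select_keyframes` and its parameters are hypothetical and not the authors' procedure.

```python
import math


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b + math.pi) % (2.0 * math.pi) - math.pi
    return abs(d)


def select_keyframes(poses, min_translation=0.30, min_rotation_deg=15.0):
    """Return the indices of poses kept by the motion thresholds.

    poses: sequence of (x, y, yaw) with x, y in meters and yaw in radians.
    The first pose is always kept (an assumption of this sketch).
    """
    if not poses:
        return []
    min_rotation = math.radians(min_rotation_deg)
    kept = [0]
    last_x, last_y, last_yaw = poses[0]
    for i, (x, y, yaw) in enumerate(poses[1:], start=1):
        moved = math.hypot(x - last_x, y - last_y)   # translation since last kept pose
        turned = angle_diff(yaw, last_yaw)           # rotation since last kept pose
        if moved >= min_translation or turned >= min_rotation:
            kept.append(i)
            last_x, last_y, last_yaw = x, y, yaw
    return kept
```

Measuring motion relative to the last kept pose, rather than the previous pose, prevents slow continuous motion from being skipped entirely; this is a design choice of the sketch.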

External links

Download (RADAR4KIT)

Affiliated institution(s) at KIT	Institut für Fördertechnik und Logistiksysteme (IFL)
Publication type	Research data
Publication date	23.06.2026
Creation date	31.03.2026
Identifier	DOI: 10.35097/qx7b62vnbercxzj9; KITopen-ID: 1000194520
License	Creative Commons Attribution 4.0 International
Project information	SeI_MoR (WM_BaWü, BW8_1222)
Keywords	intralogistics; mobile robots; semantic mapping; instance segmentation; vision-language models; RGB-LiDAR fusion; open-vocabulary; SLAM; robot perception
README	Recording setup. Data was recorded with a mobile robot equipped with two 2D laser scanners (360° range) and a forward-facing RGB camera. Robot poses and the geometric map were obtained with GMapping (2D SLAM). RGB and laser observations are temporally synchronized, which establishes a point-to-pixel correspondence between geometric and visual data. The full exploration run, recorded in a single controlled environment, was subsampled by motion thresholds (30 cm / 15°) to 74 frames.
Structure. The data is organized per frame (74 frames). Each frame provides four files: undistorted_image.png, laser.json, sam1_fine_fused_instances.json (automatically generated, fused SAM masks), and sam1_gt_instances.json (manually annotated ground-truth masks).
Formats. Images: PNG, 768 × 480, RGB, lens-undistorted. The pixel origin is top-left; u = column, v = row. laser.json: a JSON object with points (a list of {u, v, x, y, z, intensity}) and a Unix timestamp. u, v are image coordinates; x, y, z are positions in meters relative to the LiDAR frame (z = 0 for planar scans). Instance files: a JSON list of instances. Each instance is defined by a segmentation mask in pixels (a list of [u, v] mask coordinates).
Software / reuse. All annotations are plain JSON and images are standard PNG; no proprietary software is required. Files can be parsed with any JSON library (e.g. Python json) and inspected with standard image tools or NumPy/OpenCV/Matplotlib. Pixel coordinates index directly into the corresponding undistorted image, and laser (u, v) values map laser returns into the same image plane. The dataset supports tasks such as instance segmentation, multi-view object association, geometric/semantic mapping, and benchmarking vision-language models for intralogistics perception. A detailed README.md (including the two evaluated VLM prompts) is included in the dataset.
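
As an illustration of the formats described above, the following sketch reads one frame with the Python standard library only and finds the laser returns that project into an instance mask. The file names and the `points` field come from the description; the per-instance key `mask`, the treatment of missing `u`/`v` values, and all function names are assumptions made for this example, so the README.md shipped with the dataset should be consulted for the actual schema.

```python
import json
from pathlib import Path


def pixel_set(mask_pixels):
    """Convert a list of [u, v] mask coordinates into a set of (u, v) tuples."""
    return {(int(u), int(v)) for u, v in mask_pixels}


def laser_points_in_mask(points, pixels):
    """Return the laser points whose (u, v) image coordinates lie in the mask.

    points: list of dicts with keys u, v, x, y, z, intensity.
    pixels: set of (u, v) tuples, e.g. from pixel_set().
    Points without image coordinates are skipped (an assumption of this sketch).
    """
    hits = []
    for p in points:
        if p.get("u") is None or p.get("v") is None:
            continue
        if (int(round(p["u"])), int(round(p["v"]))) in pixels:
            hits.append(p)
    return hits


def load_frame(frame_dir, instance_file="sam1_gt_instances.json", mask_key="mask"):
    """Load the laser points and instance masks of one frame directory.

    mask_key is an assumed field name; adjust it to the real schema.
    """
    frame_dir = Path(frame_dir)
    laser = json.loads((frame_dir / "laser.json").read_text(encoding="utf-8"))
    instances = json.loads((frame_dir / instance_file).read_text(encoding="utf-8"))
    masks = [pixel_set(inst[mask_key]) for inst in instances]
    return laser["points"], masks
```

Calling `load_frame` on a frame directory and then `laser_points_in_mask(points, masks[i])` yields the metric x, y positions of the laser returns that fall on instance i. When working with NumPy instead, the same coordinates index an image array as `image[v, u]`, since v is the row and u is the column.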
Type of research data	Dataset
Indexed in	OpenAlex
