
Towards Automatic Parsing of Structured Visual Content through the Use of Synthetic Data

Schölch, Lukas; Steinhauser, Jonas; Beichter, Maximilian; Seibold, Constantin 1; Yang, Kailun 1; Knäble, Merlin 2; Schwarz, Thorsten; Mädche, Alexander 2; Stiefelhagen, Rainer 1
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)
2 Institut für Wirtschaftsinformatik und Marketing (IISM), Karlsruher Institut für Technologie (KIT)

Abstract:

Structured Visual Content (SVC) such as graphs, flow charts, or the like is used by authors to illustrate various concepts. While such depictions allow the average reader to better understand the contents, images containing SVCs are typically not machine-readable. This, in turn, not only hinders automated knowledge aggregation, but also the perception of displayed information for visually impaired people. In this work, we propose a synthetic dataset, containing SVCs in the form of images as well as ground truths. We show the usage of this dataset by an application that automatically extracts a graph representation from an SVC image. This is done by training a model via common supervised learning methods. As there currently exist no large-scale public datasets for the detailed analysis of SVC, we propose the Synthetic SVC (SSVC) dataset comprising 12,000 images with respective bounding box annotations and detailed graph representations. Our dataset enables the development of strong models for the interpretation of SVCs while skipping the time-consuming dense data annotation. We evaluate our model on both synthetic and manually annotated data and show the transferability of synthetic to real via various metrics, given the presented application. ...


DOI: 10.5445/IR/1000146801
Published on: 1 February 2023
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR); Institut für Wirtschaftsinformatik und Marketing (IISM)
Publication type: Research report/Preprint
Publication year: 2022
Language: English
Identifier: KITopen-ID: 1000146801
Indexed in: arXiv