
Towards Accessible Visualizations with Vision-Language Models

Moured, Omar 1
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract:

Data visualizations such as charts, plots, and diagrams are multi-dimensional representations commonly used to explore data and communicate insights. They come in diverse layouts and styles, each tailored to specific analytical needs. For example, with a smartwatch, one may monitor monthly sleep cycles with a quick glance at a visual plot.

Unfortunately, it is estimated that in 2020 approximately 70% of visual content existed in modalities inaccessible to readers with visual impairments. People sharing this content might lack the expertise to make it accessible, or might be deterred by the time and labor required. On the other hand, People with Visual Impairment (PVI) might lack confidence that the available assistive tools enable them to interpret the content independently.

In this thesis, we develop visual content analysis systems that help sighted individuals make their content accessible and that provide end-to-end access for PVI. More specifically, we investigate how to digitize documents and visuals while ensuring adherence to accessibility guidelines. To this end, we first investigate how to construct and convey layout information from deep learning models to tactile modalities. ...


DOI: 10.5445/IR/1000177390
Published on 17.12.2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: University thesis
Publication date: 17.12.2024
Language: English
Identifier: KITopen-ID: 1000177390
Publisher: Karlsruher Institut für Technologie (KIT)
Extent: xvii, 107 pp.
Thesis type: Dissertation
Faculty: Fakultät für Informatik (INFORMATIK)
Institute: Institut für Anthropomatik und Robotik (IAR)
Examination date: 12.12.2024
Project information: INTUITIVE (EU, H2020, 861166)
Keywords: data visualizations, charts, plots, diagrams, accessibility, visual impairments, assistive tools, visual content analysis systems, document digitization, layout, deep learning models, tactile materials, alternative text, LLM, vision-language models, robustness, inclusivity, document analysis
Referee/Supervisor: Stiefelhagen, Rainer; Prince, Enamul Hoque