
Multi-view Representation Learning for Unifying Languages, Knowledge and Vision

Mogadala, Aditya

Abstract (English):

The growth of content on the web has raised various challenges, yet it has also provided numerous opportunities. Content exists in varied forms, such as text in different languages, entity-relationship graphs representing structured knowledge, and visual media such as images and videos. These forms are often referred to as modalities. In many instances, different combinations of modalities co-exist to complement each other or to provide consensus, making the content either heterogeneous or homogeneous. Having an additional point of view for each instance in the content is beneficial for data-driven learning and intelligent content processing. However, despite the availability of such content, most advances in data-driven learning (i.e., machine learning) have been made by solving tasks separately for a single modality. Comparable effort has not been devoted to challenges that require input from all of the modalities or a subset of them.

In this dissertation, we develop models and techniques that leverage multiple views of heterogeneous or homogeneous content and build a shared representation to support several applications that require a combination of the modalities mentioned above. ...


DOI: 10.5445/IR/1000091950
Published on: 12 March 2019
Affiliated institution(s) at KIT: Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
Publication type: Thesis
Publication year: 2019
Language: English
Identifier: urn:nbn:de:swb:90-919508
KITopen-ID: 1000091950
Publisher: Karlsruher Institut für Technologie (KIT)
Extent: IX, 163 pp.
Thesis type: Dissertation
Faculty: Fakultät für Wirtschaftswissenschaften (WIWI)
Institute: Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
Examination date: 7 November 2018
Keywords: multi-view representation learning, cross-lingual word embeddings, cross-modal retrieval, image caption generation
Referee/Supervisor: Rettinger, A.