Multi-view Representation Learning for Unifying Languages, Knowledge and Vision

Mogadala, Aditya

doi:10.5445/IR/1000091950

Abstract (English):

The growth of content on the web has raised various challenges, yet it has also provided numerous opportunities. Content exists in varied forms, such as text appearing in different languages, entity-relationship graphs representing structured knowledge, and visual embodiments like images and videos. These forms are often referred to as modalities. In many instances, different combinations of modalities co-exist to complement each other or to provide consensus, thus making the content either heterogeneous or homogeneous. Having an additional point of view for each instance in the content is beneficial for data-driven learning and intelligent content processing. However, despite the availability of such content, most advancements in data-driven learning (i.e., machine learning) have been made by solving tasks separately for a single modality. A similar effort has not been made for challenges that require input from all of the modalities or a subset of them.

In this dissertation, we develop models and techniques that can leverage multiple views of heterogeneous or homogeneous content and build a shared representation to aid several applications that require a combination of the modalities mentioned above. ...

Published on 12.03.2019

Affiliated institution(s) at KIT	Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
Publication type	University thesis
Year of publication	2019
Language	English
Identifier	urn:nbn:de:swb:90-919508; KITopen-ID: 1000091950
Publisher	Karlsruher Institut für Technologie (KIT)
Extent	IX, 163 pp.
Type of thesis	Dissertation
Faculty	Fakultät für Wirtschaftswissenschaften (WIWI)
Institute	Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
Date of examination	07.11.2018
Keywords	multi-view representation learning, cross-lingual word embeddings, cross-modal retrieval, image caption generation
Indexed in	OpenAlex
Referee/Supervisor	Rettinger, A.

Repository KITopen