Publisher's version
DOI: 10.5445/IR/1000089255
Published on 9 January 2019

Predicting Wikipedia infobox type information using word embeddings on categories

Sack, Harald; Biswas, Russa; Koutraki, Maria

Abstract:
Wikipedia has emerged as the largest multilingual, web-based general reference work on the Internet. A huge amount of human effort has been invested in creating and updating Wikipedia articles, which are ideally complemented by so-called infobox templates that define the type of the underlying article. However, Wikipedia infobox type information is often incomplete and inconsistent, for various reasons. At the same time, it plays a fundamental role in the RDF type information of Wikipedia-based Knowledge Graphs such as DBpedia, which creates a need for infobox type information that is always correct and complete. In this work, we propose an approach to predict Wikipedia infobox types by using word embeddings on the categories of Wikipedia articles, and we analyze the impact of using minimal information from the Wikipedia articles in the prediction process.


Affiliated institution(s) at KIT: Institute of Applied Informatics and Formal Description Methods (AIFB)
Publication type: Proceedings paper
Year: 2018
Language: English
Identifier: ISSN 1613-0073
URN: urn:nbn:de:swb:90-892556
KITopen-ID: 1000089255
Published in: 2018 EKAW Posters and Demonstrations Session, EKAW-PD 2018; Nancy; France; 12 November 2018 through 16 November 2018. Ed.: O. Corby
Publisher: RWTH, Aachen
Pages: 29-32
Series: CEUR Workshop Proceedings; 2262
Keywords: Wikipedia, Infobox, Word Embeddings, Text Classification
Indexed in: Scopus