
DeFaktS: A German Dataset for Fine-Grained Disinformation Detection through Social Media Framing

Ashraf, Shaina; Bezzaoui, Isabel 1; Andone, Ionut; Markowetz, Alexander; Fegert, Jonas; Flek, Lucie
1 Institut für Wirtschaftsinformatik und Marketing (IISM), Karlsruher Institut für Technologie (KIT)

Abstract:

In today’s rapidly evolving digital age, disinformation poses a significant threat to public sentiment and socio-political dynamics. To address this, we introduce a new dataset “DeFaktS”, designed to understand and counter disinformation within German media. Distinctively curated across various news topics, DeFaktS offers an unparalleled insight into the diverse facets of disinformation. Our dataset, containing 105,855 posts with 20,008 meticulously labeled tweets, serves as a rich platform for in-depth exploration of disinformation’s diverse characteristics. A key attribute that sets DeFaktS apart is its fine-grained annotations based on polarized categories. Our annotation framework, grounded in the textual characteristics of news content, eliminates the need for external knowledge sources. Unlike most existing corpora that typically assign a singular global veracity value to news, our methodology seeks to annotate every structural component and semantic element of a news piece, ensuring a comprehensive and detailed understanding. In our experiments, we employed a mix of classical machine learning and advanced transformer-based models. ...


Publisher's version
DOI: 10.5445/IR/1000171383
Published on 18.06.2024
Affiliated institution(s) at KIT: Institut für Wirtschaftsinformatik und Marketing (IISM)
Publication type: Proceedings contribution
Publication year: 2024
Language: English
Identifier: KITopen-ID: 1000171383
Published in: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Ed.: N. Calzolari,
Event: Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy, 20.05.2024 – 25.05.2024
Publisher: ELRA Language Resources Association
Pages: 4580–4591
External relations: Abstract/Full text