
MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations

Scott, Aaron Robert; Züfle, Maike 1; Niehues, Jan
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract:

Sarcasm is a complex form of figurative language in which the intended meaning contradicts the literal one. Its
prevalence in social media and popular culture poses persistent challenges for natural language understanding,
sentiment analysis, and content moderation. With the emergence of multimodal large language models, sarcasm
detection extends beyond text and requires integrating cues from audio and vision. We present MuSaG, the first
German multimodal sarcasm detection dataset, consisting of 33 minutes of manually selected and human-annotated
statements from German television shows. Each instance provides aligned text, audio, and video modalities,
annotated separately by humans, enabling evaluation in unimodal and multimodal settings. We benchmark nine
open-source and commercial models, spanning text, audio, vision, and multimodal architectures, and compare their
performance to human annotations. Our results show that while humans rely heavily on audio in conversational
settings, models perform best on text. This highlights a gap in current multimodal models and motivates the use of
MuSaG for developing models better suited to realistic scenarios.


Original publication
DOI: 10.63317/2dc7oajwrnmt
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2026
Language: English
Identifier: ISBN: 978-2-493814-49-4
ISSN: 2522-2686
KITopen-ID: 1000194792
Published in: Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026). Eds.: Piperidis, Stelios; Bel, Núria; Heuvel, Henk van den; Ide, Nancy; Krek, Simon; Toral, Antonio
Event: 15th Language Resources and Evaluation Conference (LREC 2026), Palma de Mallorca, Spain, 11–16 May 2026
Publisher: European Language Resources Association (ELRA)
Pages: 372–392
Series: Proceedings of the Language Resources and Evaluation Conference
Keywords: Sarcasm Detection, Multimodality, German Dataset
Indexed in: OpenAlex