
Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach

Li, Siqi; Liu, Danni 1; Niehues, Jan 2
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)
2 Karlsruher Institut für Technologie (KIT)

Abstract:

Direct speech translation (ST) models often struggle with rare words. Incorrect translation of these words can have severe consequences, impacting translation quality and user trust. While rare word translation is inherently challenging for neural models due to sparse learning signals, real-world scenarios often allow access to translations of past recordings on similar topics. To leverage these valuable resources, we propose a retrieval-and-demonstration approach to enhance rare word translation accuracy in direct ST models. First, we adapt existing ST models to incorporate retrieved examples for rare word translation, which allows the model to benefit from prepended examples, similar to in-context learning. We then develop a cross-modal (speech-to-speech, speech-to-text, text-to-text) retriever to locate suitable examples. We demonstrate that standard ST models can be effectively adapted to leverage examples for rare word translation, improving rare word translation accuracy over the baseline by 17.6% with gold examples and 8.5% with retrieved examples. Moreover, our speech-to-speech retrieval approach outperforms other modalities and exhibits higher robustness to unseen speakers. ...
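The retrieve-then-prepend idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the `Example` class, the `retrieve` and `with_demonstration` helpers, the toy three-dimensional embeddings, and the placeholder strings standing in for audio and translations are all hypothetical. Retrieval is shown as a plain cosine-similarity nearest-neighbour lookup, which is only one plausible way to realise a retriever.

```python
import math
from dataclasses import dataclass


@dataclass
class Example:
    embedding: list   # toy vector; a real system would use an encoder output
    source: str       # placeholder for the example utterance (audio or text)
    translation: str  # placeholder for its reference translation


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, pool):
    """Return the pool example whose embedding is most similar to the query."""
    return max(pool, key=lambda ex: cosine(query_embedding, ex.embedding))


def with_demonstration(example, query_source):
    """Prepend the retrieved example to the query (hypothetical input layout)."""
    return {
        "source": [example.source, query_source],  # example first, then the query
        "target_prefix": example.translation,      # example translation as prefix
    }


# Toy pool of past examples (assumed data, not from the paper).
pool = [
    Example([1.0, 0.0, 0.0], "example A audio", "example A translation"),
    Example([0.0, 1.0, 0.0], "example B audio", "example B translation"),
]

best = retrieve([0.1, 0.9, 0.0], pool)
model_input = with_demonstration(best, "query audio")
```

In this toy setup the query vector lies closest to example B, so B's source and translation are placed in front of the query before it would be handed to a translation model.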

Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication date: 12.11.2024
Language: English
Identifiers: ISBN 979-88-917616-4-3; KITopen-ID 1000180187
Published in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Y. Al-Onaizan, M. Bansal, Y.-N. Chen
Event: Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, FL, USA, 12.11.2024 – 16.11.2024
Publisher: Association for Computational Linguistics (ACL)
Pages: 12703 – 12719
Indexed in: Dimensions, Scopus, OpenAlex

Publisher's version
DOI: 10.5445/IR/1000180187
Published on: 20.03.2025