
Unsupervised Vocabulary Selection for Domain-Independent Simultaneous Lecture Translation

Märgner, Paul; Kilgour, Kevin; Lane, Ian; Waibel, Alex

Abstract:

In this work, we investigate methods to automatically adapt our simultaneous lecture
translation systems to the diverse topics that occur in educational lectures. Utilizing
materials that are available before the lecture begins, such as lecture slides, our proposed
framework iteratively searches for related documents on the World Wide Web and
generates lecture-specific models and vocabularies based on the resulting documents. In
this paper, we propose a novel method for vocabulary selection, a critical aspect of
simultaneous translation systems where the occurrence of out-of-vocabulary words
significantly degrades intelligibility. The approach uses feature-based ranking, and we
evaluate the effectiveness of 21 different features and their combinations for this
task. On the interACT German-English simultaneous lecture translation system, our
proposed approach significantly improved vocabulary coverage, reducing the out-of-vocabulary
rate by 60% on average and by up to 84% compared to a lecture-independent baseline.
Furthermore, a 40k vocabulary selected using our method obtained better coverage than a
...


Publisher's version
DOI: 10.5445/IR/1000182354
Published on: 12 June 2025
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2011
Language: English
Identifier: KITopen-ID: 1000182354
Published in: International Workshop on Spoken Language Translation (IWSLT 2011), San Francisco, 8th - 9th December 2011
Event: International Workshop on Spoken Language Translation (IWSLT 2011), San Francisco, CA, USA, 8-9 December 2011
Publisher: Association for Computational Linguistics (ACL)
Pages: 19-23