The 2017 KIT IWSLT Speech-to-Text Systems for English and German

Nguyen, Thai-Son; Sperber, Sebastian; Zenkel, Thomas; Stüker, Sebastian; Müller, Markus; Waibel, Alex

Abstract:

This paper describes our German and English Speech-to-Text (STT) systems for the 2017 IWSLT evaluation campaign. The campaign focuses on the transcription of unsegmented lecture talks. Our setup includes systems built with both the Janus and Kaldi frameworks. We combined their outputs using both ROVER [1] and confusion network combination (CNC) [2] to achieve good overall performance. The individual subsystems are built using different speaker-adaptive feature combinations (e.g., lMEL with i-vector or bottleneck speaker vector), acoustic models (GMM or DNN), and speaker adaptation methods (MLLR or fMLLR). Decoding is performed in two stages, where the GMM and DNN systems are adapted on the combination of the first-stage outputs using MLLR and fMLLR. The combination setup produces a final hypothesis that has a significantly lower word error rate (WER) than any of the individual subsystems. For the English lecture task, our best combination system has a WER of 8.3% on the tst2015 development set, while our other combination setup achieved a WER of 25.7% on the German lecture task.
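The results above are reported as WER. As a reminder of what that metric measures, the following is a minimal sketch of a WER computation based on word-level edit distance. It is not the scoring tool used in the evaluation campaign; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference length.

    Illustrative sketch: tokens are split on whitespace, with no text
    normalization, which real scoring tools usually apply first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")
```

Because insertions count as errors, the WER of a single hypothesis can exceed 100% when the hypothesis is much longer than the reference.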

DOI: 10.5445/IR/1000166205

Published on 17.01.2024

Affiliated institution(s) at KIT	Institut für Anthropomatik und Robotik (IAR)
Publication type	Proceedings contribution
Publication year	2017
Language	English
Identifier	KITopen-ID: 1000166205
Published in	Proceedings of the 14th International Conference on Spoken Language Translation. Ed.: S. Sakti, M. Utiyama
Event	14th International Workshop on Spoken Language Translation (IWSLT 2017), Tokyo, Japan, 14.12.2017 – 15.12.2017
Publisher	Association for Computational Linguistics (ACL)
Pages	60-64
