Supervised Adaptation of Sequence-to-Sequence Speech Recognition Systems using Batch-Weighting

Huber, Christian; Nguyen, Tuan-Nam; Song, Kaihang; Stüker, Sebastian; Hussain, Juan; Waibel, Alexander

Abstract:

When training speech recognition systems, one often faces the situation that sufficient training data is available for the language in question, but only small amounts for the domain in question. This problem is even more severe for end-to-end speech recognition systems, which accept only transcribed speech as training data; such data is harder and more expensive to obtain than text data. In this paper we present experiments in adapting end-to-end speech recognition systems with a method called batch-weighting, which we contrast with regular fine-tuning, i.e., continuing to train existing neural speech recognition models on adaptation data. We perform experiments using these techniques in adapting to topic, accent and vocabulary, showing that batch-weighting consistently outperforms fine-tuning. To demonstrate the generalization capabilities of batch-weighting, we perform experiments in several languages, i.e., Arabic, English and German. Due to its relatively small computational requirements, batch-weighting is a suitable technique for supervised life-long learning during the lifetime of a speech recognition system, e.g., from user corrections.


DOI: 10.5445/IR/1000166148
Published on 11 January 2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2020
Language: English
Identifier: KITopen-ID: 1000166148
Published in: Proceedings of the 2nd Workshop on Life-long Learning for Spoken Language Systems. Ed.: W. M. Campbell, A. Waibel, D. Hakkani-Tur, T. J. Hazen, K. Kilgour, E. Cho, V. Kumar, H. Glaude
Event: 2nd Workshop on Life-long Learning for Spoken Language Systems (2020), Online, 7 December 2020
Publisher: Association for Computational Linguistics (ACL)
Pages: 9–17