Multilingual Acoustic Features for Porting Speech Recognition Systems to New Languages

Stüker, Sebastian


Linguists estimate the number of currently existing languages to be between 5,000 and 7,000. The fifteenth edition of the Ethnologue lists 7,299 languages. Automatic speech recognition (ASR) systems have so far been developed for only a small fraction of these languages. The languages addressed are mainly those with a large population of speakers, sufficient economic funding, or high political impact.

To cover as many languages as possible, techniques must be developed for rapidly porting speech recognition systems to new languages in a cost-efficient way. These techniques must be applicable to a new language without extensive linguistic or phonetic knowledge about it and without large amounts of training material. This is especially true for the vast number of less prevalent and under-resourced languages in the world.

In the past, phoneme-based, language-independent acoustic models have been studied for bootstrapping an acoustic model in a new language. These acoustic models have usually seen multiple languages during training and work under the assumption that phonemes are pronounced the same across languages. ...

Affiliated institution(s) at KIT Institut für Theoretische Informatik (ITI)
Publication type Proceedings contribution
Publication year 2008
Language English
Identifier ISBN: 978-3-940046-90-1
KITopen-ID: 1000010157
Published in Elektronische Sprachsignalverarbeitung. Tagungsband der 19. Konferenz, Frankfurt am Main, 8.-10. Sept. 2008. Ed.: A. Lacroix
Publisher TUDpress
Pages 141-148
Series Studientexte zur Sprachkommunikation ; 50
