Modeling Coarticulation in EMG-based Continuous Speech Recognition

Wand, Michael; Schultz, Tanja

doi:10.1016/j.specom.2009.12.002

Wand, Michael ¹; Schultz, Tanja ¹
¹ Karlsruher Institut für Technologie (KIT)

Abstract:

This paper discusses the use of surface electromyography (EMG) for automatic speech recognition. Electromyographic signals captured at the facial muscles record the activity of the human articulatory apparatus and thus make it possible to trace back a speech signal even if it is spoken silently. Since speech is captured before it becomes airborne, the resulting signal is not masked by ambient noise. The resulting Silent Speech Interface has the potential to overcome major limitations of conventional speech-driven interfaces: it is not affected by environmental noise, allows confidential information to be transmitted silently, and does not disturb bystanders. We describe our new approach of phonetic feature bundling for modeling coarticulation in EMG-based speech recognition and report results on the EMG-PIT corpus, a multi-speaker, large-vocabulary database of silent and audible EMG speech recordings that we recently collected. Our results on speaker-dependent and speaker-independent setups show that modeling the interdependence of phonetic features reduces the word error rate of the baseline system by over 33% relative. Our final system achieves a word error rate of 10% for the best-recognized speaker on a 101-word vocabulary task, bringing EMG-based speech recognition within a useful range for the application of silent speech interfaces.

Preprint

DOI: 10.5445/IR/1000026321

Published on 22 March 2018

External links

Original publication
DOI: 10.1016/j.specom.2009.12.002

Scopus
Citations: 145

Web of Science
Citations: 114

Dimensions
Citations: 116

Statistics

Page views: 199
since 3 May 2018

Downloads: 524
since 20 May 2018

Affiliated institution(s) at KIT	Fakultät für Informatik – Institut für Anthropomatik (IFA)
Publication type	Journal article
Publication year	2010
Language	English
Identifier	ISSN: 0167-6393; URN: urn:nbn:de:swb:90-263211; KITopen-ID: 1000026321
Published in	Speech Communication
Publisher	North-Holland Publishing
Volume	52
Issue	4
Pages	341-353
Keywords	EMG-based Speech Recognition, Silent Speech Interfaces, Phonetic Features
Indexed in	Scopus, Web of Science, Dimensions

Repository KITopen
