
Improving Named Entity Translation Combining Phonetic and Semantic Similarities

Huang, Fei; Vogel, Stephan; Waibel, Alex

Abstract:

This paper describes an approach to translating rarely occurring named entities (NEs) by combining phonetic and semantic similarities. The phonetic similarity is estimated from a surface string transliteration model, and the semantic similarity is calculated from a context vector semantic model. Given a source (Chinese) NE and its context, the approach first generates queries in the target (English) language according to the context translation hypotheses, then searches a target-language corpus for relevant documents. Target NEs in the retrieved documents are compared with the source NE based on their phonetic and contextual semantic similarities, and the best-matched one is selected as the correct translation. Experiments show that this approach achieves 67% accuracy on translating rarely occurring NEs and consistently improves translation quality on different tasks over a state-of-the-art statistical machine translation system.
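The final selection step described in the abstract, scoring each candidate target NE by both similarities and keeping the best match, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the phonetic score is replaced by a simple character-overlap ratio between a romanized source name and the candidate, the semantic score by cosine similarity over bag-of-words context vectors, and the two are combined by a linear interpolation with a hypothetical `weight` parameter. All function names and the example data are invented for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def phonetic_similarity(source_romanized: str, candidate: str) -> float:
    """Assumed stand-in for a transliteration model: overlap ratio in [0, 1]."""
    return SequenceMatcher(None, source_romanized.lower(), candidate.lower()).ratio()


def semantic_similarity(source_context: list[str], candidate_context: list[str]) -> float:
    """Cosine similarity between bag-of-words context vectors."""
    a, b = Counter(source_context), Counter(candidate_context)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_translation(source_romanized, source_context, candidates, weight=0.5):
    """Return the candidate name with the highest weighted sum of both scores.

    `candidates` is a list of (name, context_words) pairs; `weight` is a
    hypothetical interpolation factor between phonetic and semantic scores.
    """
    def score(item):
        name, context = item
        return (weight * phonetic_similarity(source_romanized, name)
                + (1 - weight) * semantic_similarity(source_context, context))
    return max(candidates, key=score)[0]


# Invented example: a romanized source name with its translated context words.
candidates = [
    ("Bush", ["president", "white", "house", "said"]),
    ("Bosch", ["company", "tools", "germany"]),
]
print(best_translation("bushi", ["president", "white", "house"], candidates))
```

A linear interpolation is used here only because it is the simplest way to combine two scores on the same scale; the abstract does not say how the paper weights the two similarities.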


Publisher's version
DOI: 10.5445/IR/1000166443
Published on 6 March 2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2004
Language: English
Identifier: KITopen-ID: 1000166443
Published in: Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, HLT-NAACL 2004, Boston, USA, 02-07 May 2004
Event: Human Language Technology Conference (HLT-NAACL 2004), Boston, MA, USA, 2 May 2004 – 7 May 2004
Publisher: Association for Computational Linguistics (ACL)
Pages: 281–288