
EMDC: A Semi-supervised Approach for Word Alignment

Gao, Qin; Vogel, Stephan

Abstract:

This paper proposes a novel semi-supervised word alignment technique called EMDC that integrates discriminative and generative methods. A discriminative aligner finds high-precision partial alignments, which serve as constraints for a generative aligner that implements a constrained version of the EM algorithm. Experiments on small Chinese and Arabic tasks show consistent improvements in AER. We also experimented with moderate-size Chinese machine translation tasks and obtained an average improvement of 0.5 BLEU points across five standard NIST test sets and four other test sets.
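
To illustrate the idea of a constrained EM step, the following is a minimal sketch and not the authors' implementation. It assumes IBM Model 1 as a stand-in for the generative aligner (the abstract does not name the model) and represents the high-precision partial alignments as a hypothetical dictionary mapping a target position to its fixed source position. The names `constrained_model1`, `align`, `corpus`, and `constraints` are illustrative only.

```python
from collections import defaultdict


def constrained_model1(corpus, constraints, iterations=5):
    """Toy IBM Model 1 trained with EM, where some links are fixed in advance.

    corpus:      list of (src_tokens, tgt_tokens) sentence pairs
    constraints: one dict per sentence pair, {tgt_index: src_index},
                 standing in for links supplied by a discriminative aligner
    """
    tgt_vocab = {f for _, tgt in corpus for f in tgt}
    # t[(f, e)] approximates p(f | e); start from a uniform distribution.
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: collect expected counts, honoring the fixed links.
        for (src, tgt), fixed in zip(corpus, constraints):
            for j, f in enumerate(tgt):
                # A constrained target word may only link to its fixed source word.
                candidates = [fixed[j]] if j in fixed else range(len(src))
                z = sum(t[(f, src[i])] for i in candidates)
                for i in candidates:
                    p = t[(f, src[i])] / z
                    count[(f, src[i])] += p
                    total[src[i]] += p
        # M-step: renormalize the expected counts per source word.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def align(src, tgt, t, fixed=None):
    """Pick the most probable source position for each target word."""
    fixed = fixed or {}
    return [
        fixed.get(j, max(range(len(src)), key=lambda i: t[(f, src[i])]))
        for j, f in enumerate(tgt)
    ]


# Hypothetical toy data: one known link ("haus" -> "house") in the first pair.
corpus = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
constraints = [{1: 1}, {}, {}]
table = constrained_model1(corpus, constraints)
links = align(["the", "house"], ["das", "haus"], table)
```

In this sketch the fixed link removes probability mass from competing candidates for the constrained word, which in turn sharpens the estimates for the unconstrained words in the same sentence. The procedure described in the paper may differ in the underlying model and in how constraints are enforced.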

Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings contribution
Publication year: 2010
Language: English
Identifier: KITopen-ID: 1000166343
Published in: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010). Ed.: C.-R. Huang, D. Jurafsky
Event: 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China, 23.08.2010 – 27.08.2010
Publisher: Tsinghua University Press
Pages: 349–357

Publisher's version
DOI: 10.5445/IR/1000166343
Published on 14.02.2024