
EMDC: A Semi-supervised Approach for Word Alignment

Gao, Qin; Vogel, Stephan

Abstract:

This paper proposes EMDC, a novel semi-supervised word alignment technique that integrates discriminative and generative methods. A discriminative aligner finds high-precision partial alignments, which serve as constraints for a generative aligner that implements a constrained version of the EM algorithm. Experiments on small-size Chinese and Arabic tasks show consistent improvements in alignment error rate (AER). We also experimented with moderate-size Chinese machine translation tasks and obtained an average improvement of 0.5 BLEU points across five standard NIST test sets and four other test sets.
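
The idea of running EM under partial-alignment constraints can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes IBM Model 1 as a stand-in generative aligner, and the function name `constrained_model1` and the per-sentence constraint format (a dict from target position to a fixed source position) are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def constrained_model1(bitext, constraints, iterations=10):
    """Toy IBM Model 1 EM in which fixed links restrict the E-step.

    bitext: list of (source_tokens, target_tokens) pairs.
    constraints: one dict per sentence pair mapping a target position j
        to the source position i it must align to (hypothetical format).
    Returns t[(f, e)], the probability of target word f given source word e.
    """
    # Start with equal weight for every co-occurring word pair.
    t = defaultdict(float)
    for src, tgt in bitext:
        for f in tgt:
            for e in src:
                t[(f, e)] = 1.0

    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for (src, tgt), fixed in zip(bitext, constraints):
            for j, f in enumerate(tgt):
                # Constrained E-step: a fixed link leaves a single candidate;
                # unconstrained positions consider every source word.
                cands = [fixed[j]] if j in fixed else range(len(src))
                z = sum(t[(f, src[i])] for i in cands)
                for i in cands:
                    p = t[(f, src[i])] / z
                    count[(f, src[i])] += p
                    total[src[i]] += p
        # M-step: renormalise the expected counts per source word.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t
```

In this sketch, a word pair that the constraints exclude everywhere it co-occurs never receives expected counts, so its probability stays at zero, while unconstrained positions are handled exactly as in ordinary EM.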


Publisher's version
DOI: 10.5445/IR/1000166343
Published on 14 February 2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2010
Language: English
Identifier: KITopen-ID: 1000166343
Published in: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010). Ed.: C.-R. Huang, D. Jurafsky
Event: 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China, 23–27 August 2010
Publisher: Tsinghua University Press
Pages: 349–357