
Learning a Log-Linear Model with Bilingual Phrase-Pair Features for Statistical Machine Translation

Zhao, Bing; Waibel, Alex

Abstract:

We propose a set of informative feature functions, together with a log-linear model framework, for bilingual phrase-pair extraction to improve phrase-based statistical machine translation. The base feature functions investigated are a phrase length model, phrase-level centers' distortion, lexicon translation equivalence, bracketing constraints, and word alignment links. Two generative models built on these base features yield strong baselines, illustrating the effectiveness of the proposed feature functions. We also propose strategies for extending the features, and a log-linear model over them, to extract phrase pairs effectively from parallel data. Experimental results on the TIDES'03 Chinese-English small data track show improved translation quality.


Publisher's version
DOI: 10.5445/IR/1000166422
Published on 28 February 2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2005
Language: English
Identifier: KITopen-ID: 1000166422
Published in: IJCNLP-05: Fourth SIGHAN Workshop on Chinese Language Processing. Proceedings of the Workshop. Ed.: C. Huang, G. Levow
Event: 4th Workshop on Chinese Language Processing (2005), Jeju Island, Korea, 14–15 October 2005
Publisher: Association for Computational Linguistics (ACL)
Pages: 79-86