A Systematic Comparison of Phrase-Based, Hierarchical and Syntax-Augmented Statistical MT

Zollmann, Andreas; Venugopal, Ashish; Och, Franz; Ponte, Jay


Probabilistic synchronous context-free grammar (PSCFG) translation models define weighted transduction rules that represent translation and reordering operations via nonterminal symbols. In this work, we investigate the source of the improvements in translation quality reported for two PSCFG translation models (hierarchical and syntax-augmented) when they extend a state-of-the-art phrase-based baseline that serves as the lexical support for both models. We isolate the impact on translation quality of several important design decisions in each model. We perform this comparison on three NIST language translation tasks: Chinese-to-English, Arabic-to-English and Urdu-to-English, each representing unique challenges.
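
As an illustration of what a weighted transduction rule with nonterminal symbols can look like, the following is a minimal Python sketch. It is not taken from the paper: the `Rule` class, the `apply_rule` helper, the toy rule and its weight are all hypothetical, chosen only to show how co-indexed nonterminals can express reordering between the source and target sides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Toy synchronous rule (hypothetical, for illustration only)."""
    lhs: str        # left-hand-side nonterminal, e.g. "X"
    source: tuple   # source side; nonterminals are (label, index) pairs
    target: tuple   # target side; same indices link the two sides
    weight: float   # rule weight (made-up value here)


def apply_rule(rule, fillers):
    """Build the target side, replacing each indexed nonterminal
    with the already-translated words given in `fillers`."""
    out = []
    for sym in rule.target:
        if isinstance(sym, tuple):      # nonterminal such as ("X", 1)
            out.extend(fillers[sym[1]])
        else:                           # terminal word
            out.append(sym)
    return out


# X -> < X1 de X2 , X2 of X1 >: the two nonterminals swap order.
rule = Rule(
    lhs="X",
    source=(("X", 1), "de", ("X", 2)),
    target=(("X", 2), "of", ("X", 1)),
    weight=0.5,
)

translation = apply_rule(rule, {1: ["Australia"], 2: ["the", "capital"]})
print(" ".join(translation))
```

In this sketch a hierarchical model would use a single generic label such as `X` for every nonterminal, whereas a syntax-augmented model would use syntactically motivated labels in its place; the substitution mechanics stay the same.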

DOI: 10.5445/IR/1000166379
Published on: 15.02.2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2008
Language: English
Identifier: KITopen-ID: 1000166379
Published in: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008). Ed.: D. Scott, H. Uszkoreit
Event: 22nd International Conference on Computational Linguistics (COLING 2008), Manchester, United Kingdom, 18.08.2008 – 22.08.2008
Publisher: Association for Computational Linguistics (ACL)
Pages: 1145–1152
