The CMU Syntax-Augmented Machine Translation System: SAMT on Hadoop with N-best alignments

Zollmann, Andreas; Venugopal, Ashish; Vogel, Stephan

Abstract:

We present the CMU Syntax-Augmented Machine Translation System that was used in the IWSLT-08 evaluation campaign. We participated in the Full-BTEC data track for Chinese-English translation, focusing on transcript translation. For this year’s evaluation, we ported the Syntax-Augmented MT toolkit [1] to the Hadoop MapReduce [2] parallel processing architecture, which allowed us to efficiently run experiments evaluating a novel “wider pipelines” approach that integrates evidence from N-best alignments into our translation models. We describe each step of the MapReduce pipeline as implemented in the open-source SAMT toolkit, and show that using N-best alignments improves translation quality in both hierarchical and syntax-augmented translation systems.


Publisher's version
DOI: 10.5445/IR/1000166371
Published on 19.02.2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2008
Language: English
Identifier: KITopen-ID: 1000166371
Published in: Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign
Event: 5th International Workshop on Spoken Language Translation (IWSLT 2008), Honolulu, HI, USA, 20.10.2008 – 21.10.2008
Publisher: Association for Computational Linguistics (ACL)
Pages: 18–25