
Word Reordering in Statistical Machine Translation with a POS-Based Distortion Model

Rottmann, Kay; Vogel, Stephan

Abstract:

In this paper we describe a word reordering strategy for statistical machine translation that reorders the source side based on part-of-speech (POS) information. Reordering rules are learned from the word-aligned corpus. Reordering is integrated into the decoding process by constructing a lattice that contains all word reorderings licensed by the reordering rules, with a probability assigned to each reordering. Monotone decoding is then performed on this lattice. We compare this strategy with our previous one, which considers all permutations within a sliding window. We also extend the reordering rules with context information. Phrase translation pairs are learned from both the original corpus and a reordered source corpus, so that the reordered word sequences are better captured at decoding time. Results are presented for English → Spanish and German ↔ English translation, using the European Parliament Plenary Sessions corpus.
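
As a rough illustration of the idea in the abstract, the sketch below applies POS-based reordering rules to a tagged source sentence and lists the resulting alternatives. It is a minimal sketch under stated assumptions, not the authors' implementation: the rule table `RULES`, its probability, the tag names, and the function `reorderings` are all hypothetical, and the alternatives are returned as a flat list rather than as a lattice with shared edges.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical rule table: POS pattern -> list of (permutation, probability).
# The paper learns such rules from a word-aligned corpus; the rule and its
# probability here are invented purely for illustration.
Rules = Dict[Tuple[str, ...], List[Tuple[Tuple[int, ...], float]]]

RULES: Rules = {
    ("ADJ", "NOUN"): [((1, 0), 0.7)],  # e.g. English "white house" -> Spanish word order
}


def reorderings(
    words: Sequence[str], tags: Sequence[str], rules: Rules
) -> List[Tuple[List[str], float]]:
    """Return the monotone word order plus one alternative per rule match.

    A real lattice would share the words outside the matched span between
    paths; this flat list only shows which reorderings the rules license.
    """
    words = list(words)
    # Placeholder weight of 1.0 for the unchanged (monotone) order.
    paths: List[Tuple[List[str], float]] = [(words, 1.0)]
    for start in range(len(tags)):
        for pattern, options in rules.items():
            end = start + len(pattern)
            # A slice that runs past the sentence end is shorter than the
            # pattern, so it can never match.
            if tuple(tags[start:end]) != pattern:
                continue
            for perm, prob in options:
                span = [words[start + i] for i in perm]
                paths.append((words[:start] + span + words[end:], prob))
    return paths


if __name__ == "__main__":
    for seq, weight in reorderings(
        ["the", "white", "house"], ["DET", "ADJ", "NOUN"], RULES
    ):
        print(weight, " ".join(seq))
```

A decoder in the spirit of the abstract would then translate each alternative monotonically and combine the reordering weight with its other model scores; how the paper combines them is not stated in this record.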


Publisher's version
DOI: 10.5445/IR/1000166399
Published on 22 February 2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2007
Language: English
Identifiers:
ISBN: 978-91-977095-0-7
ISSN: 1653-2325
KITopen-ID: 1000166399
Published in: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI 2007). Ed.: A. Way, B. Gawronska
Event: International Conference on Theoretical and Methodological Issues in Machine Translation of Natural Language (2007), Skövde, Sweden, 7 September 2007 – 9 September 2007
Publisher: University of Skövde
Pages: 171-180
Series: Skövde University Studies in Informatics ; 2007:1