
Consensus Versus Expertise: A Case Study of Word Alignment with Mechanical Turk

Gao, Qin; Vogel, Stephan

Abstract:

Word alignment is an important preprocessing step for machine translation. This project aims to incorporate manual alignments from Amazon Mechanical Turk (MTurk) to improve word alignment quality. As a global crowdsourcing service, MTurk can provide a flexible and abundant labor force and therefore reduce the cost of obtaining labels. We developed an easy-to-use interface to simplify the labeling process. We compare the alignments produced by Turkers with those produced by experts, and incorporate the alignments into a semi-supervised word alignment tool to improve the quality of the labels. We also compare two pricing strategies for the word alignment task. Experimental results show that the alignments provided by Turkers have high precision, and that the semi-supervised approach achieves a 0.5% absolute reduction in alignment error rate.
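
The abstract reports precision and alignment error rate (AER) without defining them. As a minimal sketch, assuming the widely used definition of AER over "sure" and "possible" gold links (the paper itself may compute it differently), the two metrics could be calculated as shown here. The function names and the toy link sets are illustrative and are not taken from the paper.

```python
def precision(hypothesis, sure, possible=()):
    """Fraction of hypothesized links that are at least 'possible' in the gold standard."""
    a = set(hypothesis)
    p = set(possible) | set(sure)  # sure links are treated as a subset of possible links
    return len(a & p) / len(a) if a else 0.0


def alignment_error_rate(hypothesis, sure, possible=()):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|); lower is better."""
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s
    if not a and not s:
        return 0.0
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Toy data: each link is a (source_index, target_index) pair.
gold_sure = {(0, 0), (1, 1), (2, 2)}
gold_possible = {(2, 3)}
turker_links = {(0, 0), (1, 1), (2, 3), (3, 3)}

print(precision(turker_links, gold_sure, gold_possible))
print(alignment_error_rate(turker_links, gold_sure, gold_possible))
```

Under this definition, a 0.5% absolute reduction means the AER value drops by 0.005, for example from 0.300 to 0.295; these example values are hypothetical.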


Publisher's version
DOI: 10.5445/IR/1000166344
Published on 14 February 2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2010
Language: English
Identifier: KITopen-ID: 1000166344
Published in: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk. Eds.: C. Callison-Burch, M. Dredze
Event: Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk (CSLDAMT 2010), Los Angeles, CA, USA, 6 June 2010
Publisher: Association for Computational Linguistics (ACL)
Pages: 30-34