
Efficient optimization for bilingual sentence alignment based on linear regression

Zechner, K.; Vogel, S.; Waibel, A.

Abstract:

This paper presents a study on optimizing the sentence pair alignment scores of a bilingual sentence alignment module. Five candidate scores based on perplexity and sentence length are introduced and tested. A linear regression model based on those candidates is then proposed and trained to predict sentence pairs' alignment quality scores solicited from human subjects. Experiments are carried out on data automatically collected from the Internet. The correlation between the scores generated by the linear regression model and the scores from human subjects is in the range of the inter-subject agreement correlations. Pearson's correlation ranges from 0.53 to 0.72 in our experiments.
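
The approach summarized in the abstract (combine five candidate scores with a linear regression model, then compare its predictions with human judgments via Pearson's correlation) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the five features are random placeholders rather than real perplexity or length scores, the "human" scores are synthetic, and the names `fit_linear_regression`, `predict`, `pearson`, `make_pairs` and `TRUE_W` are hypothetical.

```python
import random
from math import sqrt


def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    A bias column of ones is prepended, so the result is
    [bias, w1, ..., wk]. Assumes X^T X is non-singular.
    """
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for k in range(n):
            if k != col:
                f = A[k][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][n] for i in range(n)]


def predict(w, x):
    """Predicted quality score for one pair's candidate scores x."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def pearson(a, b):
    """Pearson's correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


# Synthetic stand-in data (an assumption for illustration only):
# five placeholder candidate scores per sentence pair, and a noisy
# linear "human" quality score generated from made-up weights.
rng = random.Random(0)
TRUE_W = [0.5, 1.0, -0.8, 0.6, 0.3, -0.4]


def make_pairs(n):
    X = [[rng.random() for _ in range(5)] for _ in range(n)]
    y = [predict(TRUE_W, x) + rng.gauss(0.0, 0.3) for x in X]
    return X, y


train_X, train_y = make_pairs(200)
test_X, test_y = make_pairs(100)

w = fit_linear_regression(train_X, train_y)
r = pearson([predict(w, x) for x in test_X], test_y)
print(f"Pearson r on held-out synthetic pairs: {r:.2f}")
```

Evaluating on held-out pairs rather than the training pairs keeps the reported correlation from being inflated by fitting. The value printed here reflects only the synthetic noise level and says nothing about the figures reported in the paper.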


Publisher's version
DOI: 10.5445/IR/1000009723
Published on 15.07.2025
Original publication
DOI: 10.3115/1118905.1118920
Affiliated institution(s) at KIT: Institut für Theoretische Informatik (ITI)
Publication type: Proceedings paper
Publication month/year: 03.2003
Language: English
Identifier: ISBN: 1-932432-06-X
KITopen-ID: 1000009723
Published in: Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond
Event: HLT-NAACL Workshop: Building and Using Parallel Texts: Data Driven Machine Translation and Beyond (2003), Edmonton, Canada, 27.05.2003 – 01.06.2003
Publisher: Association for Computational Linguistics (ACL)
Pages: 81-87