Efficient optimization for bilingual sentence alignment based on linear regression

Zechner, K.; Vogel, S.; Waibel, A.

doi:10.3115/1118905.1118920

Abstract:

This paper presents a study on optimizing sentence pair alignment scores of a bilingual sentence alignment module. Five candidate scores based on perplexity and sentence length are introduced and tested. Then a linear regression model based on those candidates is proposed and trained to predict sentence pairs' alignment quality scores solicited from human subjects. Experiments are carried out on data automatically collected from Internet. The correlation between the scores generated by the linear regression model and the scores from human subjects is in the range of the inter-subject agreement score correlations. Pearson's correlation ranges from 0.53 up to 0.72 in our experiments.
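
The general technique named in the abstract, fitting a linear regression over several candidate scores and then measuring Pearson's correlation against reference scores, can be sketched as follows. This is only an illustration under stated assumptions: the five features and the target values are synthetic random numbers, not the paper's perplexity or length scores and not its human judgments, and the function names (fit_linear_regression, predict, pearson) are hypothetical. The fit uses ordinary least squares via the normal equations, which is one common choice and not necessarily the procedure used by the authors.

```python
import random


def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    A bias column of ones is prepended, so w[0] is the intercept.
    """
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(col + 1, n):
            f = A[k][col] / A[col][col]
            for j in range(col, n + 1):
                A[k][j] -= f * A[col][j]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (A[i][n] - tail) / A[i][i]
    return w


def predict(w, X):
    """Apply the fitted weights (intercept first) to each feature row."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]


def pearson(a, b):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


# Synthetic stand-in data (ASSUMPTION): 200 "sentence pairs", each with five
# made-up candidate scores, and a noisy linear "reference" quality score.
rng = random.Random(0)
assumed_weights = [0.5, -0.3, 0.2, 0.1, -0.4]
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [
    3.0 + sum(w * x for w, x in zip(assumed_weights, row)) + rng.gauss(0, 0.5)
    for row in X
]

# Fit on the first 150 pairs, evaluate correlation on the held-out 50.
w = fit_linear_regression(X[:150], y[:150])
r = pearson(predict(w, X[150:]), y[150:])
print("weights (intercept first):", [round(v, 3) for v in w])
print("held-out Pearson correlation:", round(r, 3))
```

The held-out split matters here: correlation measured on the same pairs used for fitting would overstate how well the combined score tracks the reference scores.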

KITopen download

Publisher's version

DOI: 10.5445/IR/1000009723

Published on 15.07.2025

External links

Original publication
DOI: 10.3115/1118905.1118920

Affiliated institution(s) at KIT	Institut für Theoretische Informatik (ITI)
Publication type	Proceedings paper
Publication month/year	03.2003
Language	English
Identifier	ISBN: 1-932432-06-X KITopen-ID: 1000009723
Published in	Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond
Event	HLT-NAACL Workshop: Building and Using Parallel Texts Data Driven Machine Translation and Beyond (2003), Edmonton, Canada, 27.05.2003 – 01.06.2003
Publisher	Association for Computational Linguistics (ACL)
Pages	81-87
External relations	ResearchGate (see also)
