
TriS: A Statistical Sentence Simplifier with Log-linear Models and Margin-based Discriminative Training

Bach, Nguyen; Gao, Qin; Vogel, Stephan; Waibel, Alex

Abstract:

We propose a statistical sentence simplification system with log-linear models. In contrast to state-of-the-art methods that drive the sentence simplification process with hand-written linguistic rules, our method uses a margin-based discriminative learning algorithm that operates on a feature set. The feature set is defined on statistics of the surface form as well as the syntactic and dependency structures of the sentences. A stack decoding algorithm allows us to efficiently generate and search simplification hypotheses. Experimental results show that the simplified text produced by the proposed system is 1.7 Flesch-Kincaid grade levels lower than the original text. Compared with a state-of-the-art rule-based system (Heilman and Smith, 2010), the proposed system improves ROUGE-2, ROUGE-4, and AveF10 by 0.2, 0.6, and 4.5 points, respectively.
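As an illustration of the readability metric named in the abstract, the following is a minimal sketch of the standard Flesch-Kincaid grade level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. It is not the authors' evaluation code. The function names, the regex-based tokenization, and the vowel-group syllable counter are assumptions made for this example; published tools typically use more careful syllable counting, so absolute scores may differ.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels.

    This heuristic is an assumption for illustration only; it is not
    taken from the paper and miscounts words such as "table" or "made".
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Return the Flesch-Kincaid grade level of an English text."""
    # Naive splitting: sentences end at '.', '!' or '?'; words are letter runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Hypothetical sentence pair, not drawn from the paper's data.
original = "The cat, which was sitting on the mat, watched the bird that was singing."
simplified = "The cat sat on the mat. The cat watched the bird. The bird was singing."

print(flesch_kincaid_grade(original))
print(flesch_kincaid_grade(simplified))
```

Splitting one long sentence into several short ones lowers the words-per-sentence term, which is why the simplified version of this pair scores a lower grade level than the original.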


Publisher's version
DOI: 10.5445/IR/1000166338
Published on 9 February 2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings paper
Publication year: 2011
Language: English
Identifier: KITopen-ID 1000166338
Published in: Proceedings of 5th International Joint Conference on Natural Language Processing. Ed.: H. Wang, D. Yarowsky
Event: 5th International Joint Conference on Natural Language Processing (IJCNLP 2011), Chiang Mai, Thailand, 8–11 November 2011
Publisher: Asian Federation of Natural Language Processing
Pages: 474–482