
Benchmarking Diffusion Models for Machine Translation

Demirag, Yunus; Liu, Danni 1; Niehues, Jan 1
1 Institut für Anthropomatik und Robotik (IAR), Karlsruher Institut für Technologie (KIT)

Abstract (English):

Diffusion models have recently shown great potential on many generative tasks. In this work, we explore diffusion models for machine translation (MT). We adapt two prominent diffusion-based text generation models, Diffusion-LM and DiffuSeq, to perform machine translation. As diffusion models generate non-autoregressively (NAR), we draw parallels to NAR machine translation models. Through a comparison with conventional Transformer-based translation models, as well as with the Levenshtein Transformer, an established NAR MT model, we show that the multimodality problem that limits NAR machine translation performance is also a challenge for diffusion models. We demonstrate that knowledge distillation from an autoregressive model improves the performance of diffusion-based MT. A thorough analysis of translation quality across inputs of different lengths shows that the diffusion models struggle more with long-range dependencies than the other models.


Publisher's version
DOI: 10.5445/IR/1000169454
Published on 21.03.2024
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Conference proceedings contribution
Publication month/year: 03.2024
Language: English
Identifier: KITopen-ID: 1000169454
Published in: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Event: 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024), St. Julian's, Malta, 17.03.2024 – 22.03.2024
Pages: 313–324
Note on publication: Student Research Workshop, March 21–22, 2024