
Simulations of Sequence Evolution: How (Un)realistic They Are and Why

Trost, Johanna; Haag, Julia; Höhler, Dimitri; Jacob, Laurent; Stamatakis, Alexandros 1; Boussau, Bastien
1 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)

Abstract:

Simulating multiple sequence alignments (MSAs) using probabilistic models of sequence evolution plays an important role in the evaluation of phylogenetic inference tools and is crucial to the development of novel learning-based approaches for phylogenetic reconstruction, for instance, neural networks. These models and the resulting simulated data need to be as realistic as possible to be indicative of the performance of the developed tools on empirical data and to ensure that neural networks trained on simulations perform well on empirical data. Over the years, numerous models of evolution have been published with the goal of representing the sequence evolution process as faithfully as possible and thus simulating empirical-like data. In this study, we simulated DNA and protein MSAs under increasingly complex models of evolution with and without insertion/deletion (indel) events using a state-of-the-art sequence simulator. We assessed their realism by quantifying how accurately supervised learning methods are able to predict whether a given MSA is simulated or empirical.
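The evaluation idea in the abstract, training a supervised classifier to tell simulated MSAs from empirical ones, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two summary features, the nearest-centroid classifier, the function names, and the toy feature values are hypothetical and are not taken from the study, which does not specify its features or classifiers in this abstract. If a classifier of this kind reaches an accuracy close to 0.5 on balanced held-out data, the simulated MSAs are hard to distinguish from empirical ones under the chosen features.

```python
from collections import Counter

GAP = "-"


def msa_features(msa):
    """Return (gap_fraction, invariant_fraction) for an MSA given as a list
    of equal-length strings. Both features are illustrative choices."""
    n_seqs, n_sites = len(msa), len(msa[0])
    if any(len(seq) != n_sites for seq in msa):
        raise ValueError("all sequences in an MSA must have the same length")
    columns = list(zip(*msa))
    gap_fraction = sum(col.count(GAP) for col in columns) / (n_seqs * n_sites)
    # A column counts as invariant if its non-gap characters are all
    # identical (an all-gap column also counts as invariant here).
    invariant = sum(1 for col in columns if len(set(col) - {GAP}) <= 1)
    return (gap_fraction, invariant / n_sites)


def fit_centroids(vectors, labels):
    """Compute the mean feature vector (centroid) of each class."""
    counts = Counter(labels)
    sums = {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
    return {label: tuple(s / counts[label] for s in acc)
            for label, acc in sums.items()}


def predict(centroids, vec):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2
                              for a, b in zip(centroids[label], vec)),
    )


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy feature vectors with arbitrary values; they imply nothing about how
# real simulated or empirical MSAs behave.
train_vectors = [(0.00, 0.90), (0.10, 0.80), (0.30, 0.40), (0.40, 0.50)]
train_labels = ["simulated", "simulated", "empirical", "empirical"]
centroids = fit_centroids(train_vectors, train_labels)

toy_msa = ["AC-T", "ACGT", "AC-A"]  # hypothetical three-sequence alignment
toy_prediction = predict(centroids, msa_features(toy_msa))
```

In practice the feature vectors would be computed with `msa_features` (or a richer feature set) from many simulated and empirical MSAs, and `accuracy` would be evaluated on alignments held out from training.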


Publisher's version
DOI: 10.5445/IR/1000178364
Published on 22 January 2025
Original publication
DOI: 10.1093/molbev/msad277
Scopus
Citations: 10
Web of Science
Citations: 11
Dimensions
Citations: 18
Affiliated institution(s) at KIT: Institut für Theoretische Informatik (ITI)
Publication type: Journal article
Publication month/year: January 2024
Language: English
Identifier: ISSN: 0737-4038, 1537-1719
KITopen-ID: 1000178364
Published in: Molecular Biology and Evolution
Publisher: Oxford University Press (OUP)
Volume: 41
Issue: 1
Pages: Art. no. msad277
Published online ahead of print on 20 December 2023
Indexed in: OpenAlex, Web of Science, Scopus, Dimensions