
Novel and Efficient Approaches for Exploring Phylogenetic Tree Space

Togkousidis, Anastasios 1
1 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)

Abstract (English):

Inferring phylogenies under the Maximum Likelihood (ML) criterion is $\mathcal{NP}$-hard. Therefore, ML tree inference tools deploy heuristics to navigate through the vast tree space. For a given multiple sequence alignment (MSA), phylogenetic inference tools strive to determine the statistical model comprising the tree topology, its branch lengths, and the evolutionary model parameters, that maximize the phylogenetic likelihood function. The shape of the likelihood space, however, varies substantially across datasets. For MSAs with strong phylogenetic signal, the tree space exhibits a single likelihood peak, associated with a unique topology, to which most, if not all, independent ML tree searches rapidly converge. For other datasets, distinct searches may yield contradicting—yet equally likely or statistically indistinguishable—topologies, reflecting a rugged space with multiple local optima.

At the same time, MSA input sequences are inherently noisy. This increases the chances that ML tools may experience overfitting, whereby the best-found topology incorrectly models noise alongside the true phylogenetic signal. However, even if overfitting does not occur, the de facto presence of noise raises the question whether thorough optimization is justified. ...


DOI: 10.5445/IR/1000194839
Published on 02.07.2026
Affiliated institution(s) at KIT: Institut für Theoretische Informatik (ITI)
Publication type: Thesis
Publication date: 02.07.2026
Language: English
Identifier: KITopen-ID: 1000194839
Publisher: Karlsruher Institut für Technologie (KIT)
Extent: xv, 160 pp.
Type of thesis: Dissertation
Faculty: Fakultät für Informatik (INFORMATIK)
Institute: Institut für Theoretische Informatik (ITI)
Examination date: 24.06.2026
Keywords: Phylogenetics, Maximum Likelihood, Heuristics, Early Stopping, Overfitting, Parallelization, NP-hardness
Reviewers/Supervisors: Stamatakis, Alexandros; Anisimova, Maria