
Scalable methods for analyzing and visualizing phylogenetic placement of metagenomic samples

Czech, L.; Stamatakis, A. 1
1 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)

Abstract:

Background: The exponential decrease in molecular sequencing cost generates unprecedented amounts of data. Hence, scalable methods to analyze these data are required. Phylogenetic (or Evolutionary) Placement methods identify the evolutionary provenance of anonymous sequences with respect to a given reference phylogeny. This increasingly popular method is deployed for scrutinizing metagenomic samples from environments such as water, soil, or the human gut.
Novel methods: Here, we present novel and, more importantly, highly scalable methods for analyzing phylogenetic placements of metagenomic samples. More specifically, we introduce methods for (a) visualizing differences between samples and their correlation with associated meta-data on the reference phylogeny, (b) clustering similar samples using a variant of the k-means method, and (c) finding phylogenetic factors using an adaptation of the Phylofactorization method. These methods make it possible to interpret metagenomic data in a phylogenetic context, to find patterns in the data, and to identify branches of the phylogeny that are driving these patterns.
Results: To demonstrate the scalability and utility of our methods, as well as to provide exemplary interpretations of our methods, we applied them to 3 publicly available datasets comprising 9782 samples with a total of approximately 168 million sequences. ...


Publisher's version
DOI: 10.5445/IR/1000095849
Published on: 26 June 2019

Original publication
DOI: 10.1371/journal.pone.0217050

Citations
Scopus: 61
Web of Science: 56
Dimensions: 74
Affiliated institution(s) at KIT: Institut für Theoretische Informatik (ITI)
Publication type: Journal article
Publication year: 2019
Language: English
Identifier: ISSN: 1932-6203
KITopen-ID: 1000095849
Published in: PLOS ONE
Publisher: Public Library of Science (PLoS)
Volume: 14
Issue: 5
Pages: Art. No. e0217050
Published online ahead of print: 28 May 2019
Indexed in: Dimensions, Web of Science, Scopus