Swarm v3: towards tera-scale amplicon clustering

Mahé, Frédéric; Czech, Lucas; Stamatakis, Alexandros 1; Quince, Christopher; de Vargas, Colomban; Dunthorn, Micah; Rognes, Torbjørn; Birol, Inanc [ed.]
1 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)


Motivation: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) free of arbitrary global clustering thresholds. Here, we present swarm v3, which addresses the challenges posed by contemporary datasets that are growing towards terabyte sizes.
Results: Compared with previous swarm versions, swarm v3 has modernized C++ source code, a memory footprint reduced by up to 50%, and optimized CPU usage and multithreading (more than 7 times faster with default parameters). It has also been extensively tested for robustness and logic.

