
Scalable Construction of Text Indexes with Thrill

Bingmann, Timo; Gog, Simon; Kurpicz, Florian

Abstract (English):
The suffix array is the key to efficient solutions for myriads of string processing problems in different application domains, like data compression, data mining, or bioinformatics. With the rapid growth of available data, suffix array construction algorithms have to be adapted to advanced computational models such as external memory and distributed computing. In this article, we present five suffix array construction algorithms utilizing the new algorithmic big data batch processing framework Thrill, which allows scalable processing of input sizes on distributed systems in orders of magnitude that have not been considered before.

Preprint
DOI: 10.5445/IR/1000097631
Published on 10.10.2019
DOI: 10.1109/BigData.2018.8622171
Citations: 2
Citations: 1
Affiliated institution(s) at KIT: Institut für Theoretische Informatik (ITI)
Publication type: Proceedings paper
Publication month/year: 12.2018
Language: English
ISBN: 978-1-5386-5035-6
KITopen-ID: 1000097631
HGF program: 46.12.02 (POF III, LK 01) Data Activities
Published in: 2018 IEEE International Conference on Big Data (Big Data 2018), Seattle, WA, December 10-13, 2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 634–643
Indexed in: Scopus