KIT | KIT-Bibliothek | Impressum | Datenschutz

Communication Efficient Checking of Big Data Operations

Hübschle-Schneider, Lorenz; Sanders, Peter

Abstract (englisch):
We propose fast probabilistic algorithms with low (i.e., sublinear in the input size) communication volume to check the correctness of operations in Big Data processing frameworks and distributed databases. Our checkers cover many of the commonly used operations, including sum, average, median, and minimum aggregation, as well as sorting, union, merge, and zip. An experimental evaluation of our implementation in Thrill (Bingmann et al., 2016) confirms the low overhead and high failure detection rate predicted by theoretical analysis.

Open Access Logo


Postprint §
DOI: 10.5445/IR/1000085585
Veröffentlicht am 01.08.2019
Coverbild
Zugehörige Institution(en) am KIT Institut für Theoretische Informatik (ITI)
Publikationstyp Proceedingsbeitrag
Jahr 2018
Sprache Englisch
Identifikator ISBN: 978-1-5386-4369-3
KITopen-ID: 1000085585
HGF-Programm 46.12.02 (POF III, LK 01)
Erschienen in 32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, 21-25 May 2018
Verlag IEEE, Picataway (NJ)
Seiten 650–659
Schlagworte certifying algorithms, checking computation, communication efficiency, data-parallel, probabilistic algorithms
Nachgewiesen in Scopus
KIT – Die Forschungsuniversität in der Helmholtz-Gemeinschaft
KITopen Landing Page