KIT | KIT-Bibliothek | Impressum | Datenschutz

Vectorizing Database Column Scans with Complex Predicates

Willhalm, Thomas; Oukid, Ismail; Müller, Ingo; Faerber, Franz

Abstract (englisch):

The performance of the full table scan is critical for the overall performance of column-store database systems such as the SAP HANA database. Compressing the underlying column data format is both an advantage and a challenge, because it reduces the data volume involved in a scan on one hand and introduces the need for decompression during the scan on the other hand. In previous work [26] we have shown how to accelerate the column-scan with range predicates using SIMD instructions. In this paper, we present a framework for vectorized scans with more complex predicates. One important building block is the In-List predicate, where all rows whose values are contained in a given list of values are selected. While this seems to exhibit only little data parallelism on first sight, we show that a performant vectorized implementation is possible using the new Intel AVX2 instruction set. We also improve our previous algorithms by leveraging the increased vector-width. Finally in a detailed performance evaluation, we show the benefit of these optimizations and of the new instruction set: in almost all cases our scans needs less than one CPU cycle per row including scans with In-List predicate, leading to an overall throughput of 8 billion rows per second and more on a single core.

Zugehörige Institution(en) am KIT Institut für Theoretische Informatik (ITI)
Publikationstyp Proceedingsbeitrag
Publikationsjahr 2013
Sprache Englisch
Identifikator KITopen-ID: 1000097634
Erschienen in International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (ADMS 2013), Riva del Garda, Trento, I, August 26, 2013.
Seiten 1–12
KIT – Die Forschungsuniversität in der Helmholtz-Gemeinschaft
KITopen Landing Page