
High Performance Construction of RecSplit Based Minimal Perfect Hash Functions

Bez, Dominik 1; Kurpicz, Florian 2; Lehmann, Hans-Peter 2; Sanders, Peter 2
1 Karlsruher Institut für Technologie (KIT)
2 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)

Abstract:

A minimal perfect hash function (MPHF) bijectively maps a set S of objects to the first |S| integers. It can be used as a building block in databases and data compression. RecSplit [Esposito et al., ALENEX'20] is currently the most space-efficient practical minimal perfect hash function. It relies heavily on trying out hash functions by brute force.
We introduce rotation fitting, a new technique that makes the search more efficient by drastically reducing the number of hash functions tried. Additionally, we greatly improve the construction time of RecSplit by harnessing parallelism at the level of bits, vectors, cores, and GPUs.
In combination, the resulting improvements yield speedups of up to 239 on an 8-core CPU and up to 5438 using a GPU. The original single-threaded RecSplit implementation needs 1.5 hours to construct an MPHF for 5 million objects with 1.56 bits per object. On the GPU, we achieve the same space usage in just 5 seconds. Given that the speedups are larger than the increase in energy consumption, our implementation is more energy efficient than the original implementation.
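
The brute-force search mentioned in the abstract can be illustrated with a minimal sketch. It is not the RecSplit implementation: the helper names `hash_to_range` and `find_bijective_seed`, the use of BLAKE2b as the seeded hash, and the sample keys are all assumptions made for illustration. The sketch tries seeds one by one until a seeded hash maps a small set of distinct keys bijectively to the first |S| integers.

```python
import hashlib


def hash_to_range(key: str, seed: int, m: int) -> int:
    """Hash `key` under `seed` into [0, m). Hypothetical helper, not RecSplit's hash."""
    data = f"{seed}:{key}".encode("utf-8")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little") % m


def find_bijective_seed(keys, max_tries=1_000_000):
    """Try seeds 0, 1, 2, ... until the seeded hash is a bijection on `keys`."""
    m = len(keys)
    for seed in range(max_tries):
        positions = {hash_to_range(k, seed, m) for k in keys}
        if len(positions) == m:  # no two keys collide, so all m slots are used
            return seed
    raise RuntimeError("no bijective seed found within max_tries")


# Illustrative key set (assumed, distinct keys).
keys = ["apple", "banana", "cherry", "date", "elderberry", "fig"]
seed = find_bijective_seed(keys)
mphf = {k: hash_to_range(k, seed, len(keys)) for k in keys}
```

For m keys, a random function is a bijection with probability m!/m^m, so the expected number of seeds tried grows quickly with m. This is why a search of this kind is only feasible on small key sets, and why reducing the number of tried hash functions matters.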


Publisher's version
DOI: 10.5445/IR/1000162136
Published on 13 September 2023
Original publication
DOI: 10.4230/LIPIcs.ESA.2023.19
Scopus
Citations: 3
Affiliated institution(s) at KIT: Institut für Theoretische Informatik (ITI)
Publication type: Proceedings paper
Publication year: 2023
Language: English
Identifier: ISBN: 978-3-95977-295-2
ISSN: 1868-8969
KITopen-ID: 1000162136
Published in: 31st Annual European Symposium on Algorithms (ESA 2023), Schloss Dagstuhl - Leibniz-Zentrum für Informatik. Ed.: I. Gørtz
Event: 31st Annual European Symposium on Algorithms (Part of ALGO 2023) (ESA 2023), Amsterdam, Netherlands, 4–8 September 2023
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik (LZI)
Pages: 19:1-19:16
Series: Leibniz international proceedings in informatics : LIPIcs / Schloss Dagstuhl Leibniz-Zentrum für Informatik ; 274
Keywords: compressed data structure, parallel perfect hashing, bit parallelism, GPU, SIMD, parallel computing, vector instructions, Theory of computation → Data compression, Information systems → Point lookups
Indexed in: Scopus