KIT | KIT-Bibliothek | Impressum | Datenschutz

MorphisHash: Improving Space Efficiency of ShockHash for Minimal Perfect Hashing

Hermann, Stefan ORCID iD icon 1; Benoit, Anne [Hrsg.]; Kaplan, Haim [Hrsg.]; Wild, Sebastian [Hrsg.]; Herman, Grzegorz [Hrsg.]
1 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)

Abstract:

A minimal perfect hash function (MPHF) maps a set of $n$ keys to unique positions $\{1, …, n\}$. Representing an MPHF requires at least $\log₂(e)≈ 1.443$ bits per key. ShockHash is a technique to construct an MPHF and requires just slightly more space. It gives each key two random candidate positions. If each key can be mapped to one of its two candidate positions such that there is exactly one key mapped to each position, then an MPHF is found. If not, ShockHash repeats the process with a new set of random candidate positions. ShockHash has to store how many repetitions were required and for each key to which of the two candidate positions it is mapped. However, when a given set of candidate positions can be used as MPHF then there is not only one but multiple ways of mapping the keys to one of their candidate positions such that the mapping results in an MPHF. This redundancy makes up for the majority of the remaining space overhead in ShockHash. In this paper, we present MorphisHash which almost completely eliminates this redundancy. Our theoretical result is that MorphisHash saves $Θ(\ln(n))$ bits in expectation compared to ShockHash. ... mehr


Verlagsausgabe §
DOI: 10.5445/IR/1000185755
Veröffentlicht am 15.10.2025
Originalveröffentlichung
DOI: 10.4230/LIPIcs.ESA.2025.9
Cover der Publikation
Zugehörige Institution(en) am KIT Institut für Theoretische Informatik (ITI)
Publikationstyp Proceedingsbeitrag
Publikationsdatum 01.10.2025
Sprache Englisch
Identifikator ISBN: 978-3-95977-395-9
KITopen-ID: 1000185755
HGF-Programm 46.21.02 (POF IV, LK 01) Cross-Domain ATMLs and Research Groups
Erschienen in 33rd Annual European Symposium on Algorithms (ESA 2025), Warsaw, September 15-17, 2025
Veranstaltung 33rd Annual European Symposium on Algorithms (ESA 2025), Warschau, Polen, 15.09.2025 – 17.09.2025
Verlag Schloss Dagstuhl - Leibniz-Zentrum für Informatik (LZI)
Serie Leibniz International Proceedings in Informatics (LIPIcs) ; 351
Schlagwörter compressed data structure, perfect hashing, random graph, pseudoforest, component, Theory of computation → Data compression, Information systems → Point lookups, Theory of computation → Bloom filters and hashing, Mathematics of computing → Random graphs
Nachgewiesen in Scopus
OpenAlex
Globale Ziele für nachhaltige Entwicklung Ziel 7 – Bezahlbare und saubere EnergieZiel 9 – Industrie, Innovation und Infrastruktur
KIT – Die Universität in der Helmholtz-Gemeinschaft
KITopen Landing Page