KIT | KIT-Bibliothek | Impressum | Datenschutz

Combined Search and Encoding for Seeds, with an Application to Minimal Perfect Hashing

Lehmann, Hans-Peter ORCID iD icon 1; Sanders, Peter ORCID iD icon 1; Walzer, Stefan ORCID iD icon 1; Ziegler, Jonatan 2; Benoit, Anne [Hrsg.]; Kaplan, Haim [Hrsg.]; Wild, Sebastian [Hrsg.]; Herman, Grzegorz [Hrsg.]
1 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)
2 Karlsruher Institut für Technologie (KIT)

Abstract:

Randomised algorithms often employ methods that can fail and that are retried with independent randomness until they succeed. Randomised data structures therefore often store indices of successful attempts, called seeds. If $n$ such seeds are required (e.g., for independent substructures) the standard approach is to compute for each $i ∈ [n]$ the smallest successful seed $S_i$ and store $S = (S_1,…,S_n)$.
The central observation of this paper is that this is not space-optimal. We present a different algorithm that computes a sequence $S' = (S_1',…,S_n')$ of successful seeds such that the entropy of $S'$ undercuts the entropy of $S$ by $Ω(n)$ bits in most cases. To achieve a memory consumption of $OPT+εn$, the expected number of inspected seeds increases by a factor of $O(1/ε)$.
We demonstrate the usefulness of our findings with a novel construction for minimal perfect hash functions that, for $n$ keys and any $ε ∈ [n^{-3/7},1]$, has space requirement $(1+ε)OPT$ and construction time $O(n/ε)$. All previous approaches only support $ε = ω(1/log n)$ or have construction times that increase exponentially with $1/ε$. Our implementation beats the construction throughput of the state of the art by more than two orders of magnitude for $ε ≤ 3$%.


Verlagsausgabe §
DOI: 10.5445/IR/1000185753
Veröffentlicht am 15.10.2025
Originalveröffentlichung
DOI: 10.4230/LIPIcs.ESA.2025.109
Scopus
Zitationen: 2
Cover der Publikation
Zugehörige Institution(en) am KIT Institut für Theoretische Informatik (ITI)
Publikationstyp Proceedingsbeitrag
Publikationsdatum 01.10.2025
Sprache Englisch
Identifikator ISBN: 978-3-95977-395-9
KITopen-ID: 1000185753
Erschienen in 33rd Annual European Symposium on Algorithms (ESA 2025), Warschau, 15th-17th September 2025
Veranstaltung 33rd Annual European Symposium on Algorithms (ESA 2025), Warschau, Polen, 15.09.2025 – 17.09.2025
Verlag Schloss Dagstuhl - Leibniz-Zentrum für Informatik (LZI)
Serie Leibniz International Proceedings in Informatics (LIPIcs) ; 351
Schlagwörter Random Seed, Encoding, Bernoulli Process, Backtracking, Perfect Hashing, Theory of computation → Randomness, geometry and discrete structures, Theory of computation → Data compression, Theory of computation → Data structures design and analysis
Nachgewiesen in Scopus
KIT – Die Universität in der Helmholtz-Gemeinschaft
KITopen Landing Page