Project Details
Analysing Random Walk Insertions in Cuckoo Hash Tables and Practical Filter Data Structures
Applicant
Dr. Stefan Walzer
Subject Area
Theoretical Computer Science
Term
from 2021 to 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 465963632
This project concerns space-efficient data structures using hashing. There are two parts.Part A: Since its first description by Pagh and Rodler, cuckoo hashing has spawned a rich set of variations that power efficient dictionaries and related data structures such as cuckoo filters.A satisfactory analysis of the expected insertion time into such dictionaries when using the natural random walk insertion strategy has, despite notable effort of several groups of researchers, not yet been achieved. This project will take another stab at proving that expected insertion times are O(1) in the relevant settings. If successful, the result would hopefully apply to important variations of cuckoo hashing and may inspire novel applications where researchers were previously discouraged by insufficient theoretical guarantees.A possible ingredient in a proof could be the use of the objective method, which has, through the work of Lelarge et al., found its way into the cuckoo hashing literature and succeeded in reducing some unwieldy combinatorial problems (related to thresholds) to much more manageable analytical problems.A first milestone could be the analysis of the random walk strategy on peelable configurations. These have become more relevant due to the recent discovery of distributions of hash functions that yield peelable configurations in high-density settings.Part B: The importance of filter data structures in networking and databases is hard to overstate. A novel retrieval data structure has recently been picked up by Facebook engineer Peter Dillinger, leading to the development of Ribbon filters as an alternative to (static) Bloom filters, enabling significant space savings at the expense of higher construction time.Dillinger's implementation has since been integrated into the widely used data storage engine RocksDB.In collaboration with Dillinger, a full analysis of Ribbon filters shall be pursued. This includes the first precise description of the data structure, a mathematical analysis of its asymptotic performance, an exploration of natural variations and a comparison with competing ideas in realistic settings.
DFG Programme
Research Grants