Project Details
Data Structures for Geometric Clustering Queries
Applicant
Professor Dr. Kevin Buchin
Subject Area
Theoretical Computer Science
Term
since 2025
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 550797858
Data structures for range searching are an important and fundamental research subject in Computational Geometry. Such data structures store geometric objects in a way that all objects within a given query range can be retrieved efficiently. In many applications of range searching, however, the actual aim of the range query is not the mere list of objects in the range, but to perform further analysis on the objects. An important example of such an analysis is clustering the objects, which is in particular of interest when there are many objects in the range. With the current algorithmic techniques, doing such an analysis within a query region is a two-stage process: first a range-searching data structure is used that reports all the data points lying inside the query region, and then an analysis algorithm is used that performs the analysis on these points. The latter is done from scratch for each query region. Since analysis algorithms are typically quite slow, this seriously hampers an efficient exploration. In this project we want to overcome this bottleneck by developing novel geometric range-analysis data structures that integrate the analysis into the query-answering algorithms. This general goal gives rise to a host of challenging problems, depending on the analysis task and on the type of geometric data at hand. To initiate the study of such range-analysis data structures we will focus on a core analysis task: geometric clustering, that is, grouping geometric objects based on their proximity. High-quality clustering is a time-intensive task, and therefore clustering from scratch for every query will typically be too slow. The objective of the proposed research is to develop geometric data structures that integrate range searching with clustering for points and curves, so that analyses on the data within a query region do not have to be done from scratch for each query. Our first goal is to develop such data structures for point sets. Within this goal both exact and approximate methods are of importance. For this, we plan to extend existing methods for range queries to the more complex problem of clustering. A challenge here is that clustering problems are not decomposable. Our second goal is to extend these range clustering data structures to sets of curves. This is even more challenging, as clustering for curves under a distance measure such as the Fréchet distance is more difficult than clustering points. Our third goal is to integrate handling of uncertainty and outliers into range clustering data structures. Here a challenge is that geometric uncertainty has so far mostly been studied in an algorithmic context but less in a data-structuring context. Our project presents a new view on preprocessing geometric data sets. It will lead to novel data structures for efficient localized clustering, and the obtained insights can also serve as base for other types of complex range queries and for improved algorithms for clustering in general.
DFG Programme
Research Grants
International Connection
Netherlands
Cooperation Partners
Professor Mark de Berg; Dr. Maarten Löffler
