Advances in Topological Data Analysis
Statistics and Econometrics
Final Report Abstract
The project addresses advanced problems in statistical topological data analysis (TDA). The overarching goal is to understand the strengths of the TDA methodology and whether it can enable data scientists to make better-informed decisions. The project consists of four subprojects. The first two study more fundamental problems in TDA and in stochastic geometry in general, whereas the third and fourth treat more practical problems in the analysis of data objects.
The first subproject concerns the functional weak convergence of certain statistics in topological data analysis. We study the (functional) asymptotic normality, and the corresponding rates of convergence, of the Euler characteristic for the Čech and Vietoris-Rips filtrations in the thermodynamic regime. A further contribution establishes the asymptotic normality of a variety of test statistics derived from a tessellation-adapted refinement of the persistence diagram. The main results are established for Voronoi and Laguerre tessellations whose generators form a Gibbs point process.
The second subproject establishes the Bahadur representation of sample quantiles for stabilizing score functionals in stochastic geometry and studies local fluctuations of the corresponding empirical distribution function. The scores are obtained from a Poisson process observed on a growing rectangular window in Euclidean space. We apply the results to trimmed and Winsorized means of the score functionals and establish a law of the iterated logarithm for the sample quantiles of the scores.
The third subproject investigates multivariate bootstrap procedures for general stabilizing statistics, with specific application to topological data analysis. The motivation is that existing limit theorems for topological statistics prove difficult to use in practice for constructing confidence intervals. Moreover, the standard nonparametric bootstrap does not directly provide asymptotically valid confidence intervals in some situations. We therefore develop a consistent smoothed bootstrap procedure.
The fourth subproject studies statistical tests in topological data analysis. We study a persistent homology-based approach for the detection of change points in a weakly dependent sequence of persistence diagrams. The tests rely on stability results for commonly used filtration functions when the underlying point cloud experiences small perturbations. Another contribution develops two-sample tests for relevant differences in the (Fréchet) variances of persistence diagrams. Here, the tests rely on self-normalizing statistics and a functional central limit theorem for U-statistics of weakly dependent data.
Besides these four TDA-related subprojects, a further contribution was completed within the funding period of the project. It treats statistical inference for an intrinsic wavelet estimator of curves of symmetric positive definite (SPD) matrices in a log-Euclidean manifold.
Publications
-
On approximation theorems for the Euler characteristic with applications to the bootstrap. Electronic Journal of Statistics, 15(2).
Krebs, Johannes; Roycraft, Benjamin & Polonik, Wolfgang
-
Bootstrapping persistent Betti numbers and other stabilizing statistics. The Annals of Statistics, 51(4).
Roycraft, Benjamin; Krebs, Johannes & Polonik, Wolfgang
-
On the stability of the filtration functions for weakly dependent data with applications to structural break detection
Krebs, Johannes & Rademacher, Daniel
-
Persistent homology based goodness-of-fit tests for spatial tessellations. Journal of Nonparametric Statistics, 36(1), 39-59.
Hirsch, Christian; Krebs, Johannes & Redenbach, Claudia
-
On the Bahadur representation of sample quantiles for score functionals. Bernoulli, 30(4).
Krebs, Johannes
-
Statistical inference for wavelet curve estimators of symmetric positive definite matrices. Journal of Statistical Planning and Inference, 231, 106140.
Rademacher, Daniel; Krebs, Johannes & von Sachs, Rainer
-
Two-sample tests for relevant differences in persistence diagrams
Krebs, Johannes & Rademacher, Daniel
