Project Details

Computationally tractable bootstrap for high-dimensional data

Subject Area Mathematics
Term since 2021
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 460867398
 
This project develops a computationally tractable yet fully nonparametric bootstrap for massive data and studies it in the high-dimensional regime. Based on $n$ independent, identically distributed $p$-dimensional observations, where both $n$ and $p$ may be large, we combine subsampling with a suitable dimension reduction of the subsampled observations. This data reduction approach is motivated by the experience that, in many situations, a suitably selected "representative subpopulation" of each datum already contains the essential statistical information for the problem under consideration. For statistics characterized by the spectrum of the population covariance matrix, we rigorously introduce the so-called representative subpopulation condition and investigate its validity in commonly used statistical models. The novel approach is amenable to distributed computation with subsequent averaging, even in the high-dimensional regime, and yields a new data-reduction-based bootstrap that is computationally tractable for massive data sets.
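The general idea of pairing subsampling with dimension reduction and then averaging across workers can be illustrated with a small toy sketch. It is not the project's actual procedure: the function names, the choice of the largest sample-covariance eigenvalue as the spectral statistic, the 95% quantile as the summarized quantity, and in particular the uniformly random selection of coordinates (a stand-in for a properly chosen "representative subpopulation") are all assumptions made for illustration only.

```python
import numpy as np


def top_eigenvalue(X):
    """Largest eigenvalue of the sample covariance of X (rows = observations)."""
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    return float(np.linalg.eigvalsh(cov)[-1])  # eigvalsh sorts ascending


def reduced_subsample_replicates(X, n_sub, p_sub, n_rep, rng):
    """Recompute the statistic on random row subsamples restricted to
    p_sub coordinates (hypothetical placeholder for dimension reduction)."""
    n, p = X.shape
    reps = np.empty(n_rep)
    for b in range(n_rep):
        rows = rng.choice(n, size=n_sub, replace=False)  # subsampling
        cols = rng.choice(p, size=p_sub, replace=False)  # assumed reduction
        reps[b] = top_eigenvalue(X[np.ix_(rows, cols)])
    return reps


def distributed_estimate(X, n_workers, n_sub, p_sub, n_rep, seed=0):
    """Split the rows into disjoint blocks (one per simulated worker),
    compute a 95% quantile of the replicates on each block, and average."""
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(X.shape[0]), n_workers)
    per_worker = [
        np.quantile(
            reduced_subsample_replicates(X[idx], n_sub, p_sub, n_rep, rng), 0.95
        )
        for idx in blocks
    ]
    return float(np.mean(per_worker))


if __name__ == "__main__":
    data = np.random.default_rng(1).standard_normal((400, 50))
    print(distributed_estimate(data, n_workers=4, n_sub=50, p_sub=10, n_rep=20))
```

Each simulated worker only ever handles `n_sub` by `p_sub` arrays, which is where the computational saving in such a scheme would come from. Whether the reduced statistic still carries the relevant information is exactly the kind of question the representative subpopulation condition is meant to address.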
DFG Programme Research Units
 
 
