Project Details
Preprocessed and high-dimensional data sets in discriminant analysis and classification
Subject Area
Mathematics
Term
since 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 460867398
This project studies how data preprocessing and high-dimensional feature vectors, two prevalent challenges in modern data science, affect classification, regression and linear discriminant analysis. Privacy-preserving preprocessing can weaken the influence of features on outcomes, while high-dimensional data often include many weak predictors. We develop statistical methods suited to such settings, with an emphasis on computationally efficient learning algorithms with provable convergence rates. Specifically, we study semiparametric binary regression and analyze (stochastic) gradient descent for (penalized) empirical risk minimization. Our analysis progresses from foundational high-dimensional linear and additive models to more complex neural network architectures.
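As a rough illustration of what stochastic gradient descent for penalized empirical risk minimization looks like in a binary regression setting, the following minimal sketch may help. It is not the project's method: the logistic link (a parametric special case, whereas the project considers semiparametric models), the ridge penalty, the constant step size, the synthetic data and all function names (`sgd_logistic`, `make_data`) are assumptions chosen for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sgd_logistic(X, y, lam=0.01, step=0.1, epochs=50, seed=0):
    """SGD on the ridge-penalized logistic empirical risk (illustrative)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)  # one pass over the data in random order
        for i in idx:
            # Gradient of the logistic loss at sample i, plus the ridge term.
            err = sigmoid(dot(w, X[i])) - y[i]
            w = [wj - step * (err * xj + lam * wj) for wj, xj in zip(w, X[i])]
    return w


def make_data(n=500, d=20, seed=1):
    # Hypothetical toy data: only the first two of d features carry signal.
    rng = random.Random(seed)
    beta = [1.5, -1.5] + [0.0] * (d - 2)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        X.append(x)
        y.append(1 if rng.random() < sigmoid(dot(beta, x)) else 0)
    return X, y


X, y = make_data()
w_hat = sgd_logistic(X, y)

# Training accuracy of the fitted classifier (threshold 0.5).
accuracy = sum(
    (sigmoid(dot(w_hat, x)) >= 0.5) == (label == 1) for x, label in zip(X, y)
) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

The penalty weight `lam` and the step size `step` are fixed by hand here; how such choices affect convergence rates is the kind of question the project's analysis addresses.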
DFG Programme
Research Units
Subproject of
FOR 5381: Mathematical Statistics in the Information Age - Statistical Efficiency and Computational Tractability
International Connection
Austria
Partner Organisation
Fonds zur Förderung der wissenschaftlichen Forschung (FWF)
Cooperation Partner
Professor Dr. Lukas Steinberger
