Preprocessed and high-dimensional data sets in discriminant analysis and classification

Applicants Professorin Dr. Sophie Langer; Professorin Dr. Angelika Rohde

Subject Area Mathematics

Term since 2021

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 460867398

Project Description

This project studies how data preprocessing and high-dimensional feature vectors, two prevalent challenges in modern data science, affect classification, regression and linear discriminant analysis. Privacy-preserving preprocessing can weaken the influence of features on outcomes, while high-dimensional data often includes many weak predictors. We develop statistical methods suited to such settings, with an emphasis on computationally efficient learning algorithms with provable convergence rates. Specifically, we consider semiparametric binary regression and analyze (stochastic) gradient descent for (penalized) empirical risk minimization. Our analysis progresses from foundational high-dimensional linear and additive models to more complex neural network architectures.
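As a rough illustration of stochastic gradient descent for penalized empirical risk minimization in binary regression, the following minimal sketch fits a ridge-penalized logistic model on synthetic data. It is not the project's method: the logistic link, the squared-norm penalty, the constant step size, the synthetic data and all names (`sgd_logistic`, `make_data`, `lam`, `step`) are assumptions chosen for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sgd_logistic(X, y, lam=0.01, step=0.1, epochs=50, seed=0):
    """SGD on the ridge-penalized logistic loss; labels y are in {0, 1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)  # one pass over the data in random order
        for i in idx:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])))
            # Gradient of the single-sample loss plus the penalty (lam/2)*||w||^2.
            w = [wj - step * ((p - y[i]) * xj + lam * wj)
                 for wj, xj in zip(w, X[i])]
    return w


def make_data(n=400, d=5, seed=1):
    # Hypothetical synthetic data: one informative feature, the rest are noise.
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(2.0 * x[0]) else 0 for x in X]
    return X, y


X, y = make_data()
w_hat = sgd_logistic(X, y)

# Training accuracy of the fitted linear classifier.
accuracy = sum(
    (sigmoid(sum(a * b for a, b in zip(w_hat, x))) > 0.5) == (t == 1)
    for x, t in zip(X, y)
) / len(y)
```

In this toy setting the penalty weight `lam` shrinks the coefficients of the uninformative features toward zero, which loosely mirrors the role of penalization when many predictors are weak.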

DFG Programme Research Units

Subproject of FOR 5381: Mathematical Statistics in the Information Age - Statistical Efficiency and Computational Tractability

International Connection Austria

Partner Organisation Fonds zur Förderung der wissenschaftlichen Forschung (FWF)

Cooperation Partner Professor Dr. Lukas Steinberger
