Project Details
Optimal actions and stopping in sequential learning
Subject Area
Mathematics
Term
since 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 460867398
We advance the methodology of sequential statistical and learning methods, which are standard in modern data science. Typically, iterative optimization methods are applied to minimize a statistical criterion such as the empirical risk over a high-dimensional or even infinite-dimensional parameter space. Stopping these iterations early usually induces an implicit regularization. We develop data-driven stopping rules that come close to the optimal (oracle) iteration number, and we quantify the statistical loss in terms of explicit oracle inequalities. One focus is on classification tasks and complex learning algorithms, in particular decision trees (CART) and (stochastic) gradient descent. For CART we also develop a more transparent and more powerful analysis under isotonicity constraints. Equally important in sequential learning are active sampling strategies, which we extend to the practically relevant setting of anisotropic function classes. Regression and classification functions can then be learned faster, up to a prescribed level of accuracy, by steering the sampling in a data-driven way towards areas where the function exhibits larger variation or which lie close to the classification boundary. Overall, we expect significant contributions towards achieving statistical and computational efficiency simultaneously. (A minimal illustration of a data-driven stopping rule is sketched below.)
DFG Programme
Research Units
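As a concrete illustration of the kind of early stopping described in the abstract, the following minimal Python sketch runs gradient descent on a synthetic least-squares problem and stops via the classical discrepancy principle, i.e. as soon as the squared residual norm falls below the noise level n * sigma^2. The discrepancy principle is only one well-known example of a data-driven stopping rule and need not coincide with the rules developed in this project; the data, the dimensions, and the assumption that the noise level sigma is known are hypothetical choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic regression problem: n observations, p parameters.
n, p, sigma = 200, 50, 1.0
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta_true = np.concatenate([5.0 * np.ones(5), np.zeros(p - 5)])  # sparse signal
y = X @ beta_true + sigma * rng.standard_normal(n)

# Gradient descent on the empirical risk (1/2) * ||y - X b||^2.
step = 1.0 / np.linalg.norm(X, 2) ** 2   # safe step size: inverse squared spectral norm
beta = np.zeros(p)
threshold = n * sigma**2                 # discrepancy rule: expected squared noise norm

for t in range(1, 10_001):
    residual = y - X @ beta
    # Stop once the data are explained up to the noise level;
    # iterating further would mainly fit noise (overfitting).
    if residual @ residual <= threshold:
        break
    beta += step * (X.T @ residual)      # one gradient step

print(f"stopped at iteration {t}; parameter error {np.linalg.norm(beta - beta_true):.3f}")
```

The oracle iteration number would instead minimize the unobservable error ||beta_t - beta_true||; oracle inequalities of the kind mentioned in the abstract bound how much a data-driven stopping time of this type loses against that benchmark.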
