Multilevel Architectures and Algorithms in Deep Learning

Applicants Professor Dr. Roland Herzog; Professor Dr. Anton Schiela
Subject Area Mathematics
Term since 2021
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 464103607

Project Description

The design of deep neural networks (DNNs) and their training is a central issue in machine learning, and progress in these areas is one of the driving forces behind the success of these technologies. Nevertheless, tedious experimentation and human interaction are often still needed during the learning process to find an appropriate network structure and corresponding hyperparameters that yield the desired behavior of a DNN. The strategic goal of the proposed project is to provide algorithmic means to improve this situation.

Our methodical approach relies on well-established mathematical techniques: we identify fundamental algorithmic quantities and construct a-posteriori estimates for them; we identify and consistently exploit an appropriate topological framework for the given problem class; and we establish a multilevel structure for DNNs, accounting for the fact that a DNN realizes only a discrete approximation of a continuous nonlinear mapping relating input to output data. Combining this multilevel idea with novel algorithmic control strategies and preconditioning, we will establish a new class of adaptive multilevel algorithms for deep learning, which not only optimize a fixed DNN but also adaptively refine and extend the DNN architecture during the optimization loop. This concept is not restricted to a particular network architecture, and we will study feedforward neural networks, ResNets, and physics-informed neural networks (PINNs) as relevant examples.

Our integrated approach will thus be able to replace many of the current manual tuning techniques with algorithmic strategies based on a-posteriori estimates. Moreover, our algorithms will reduce both the computational effort for training and the size of the resulting DNN compared to a manually designed counterpart, making the use of deep learning more efficient in many respects. Finally, in the long run, our algorithmic approach has the potential to enhance the reliability and interpretability of the resulting trained DNN.
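
To make the multilevel idea concrete, the following Python sketch (using PyTorch) shows one way such an adaptive loop could look for a small feedforward regression network: training starts on a narrow (coarse) network, and the hidden layer is widened during the optimization loop when the current level does not yet reach the target accuracy. The widening scheme, the refinement criterion, and the names widen_network and train_adaptively are illustrative assumptions for this sketch, not the project's actual estimators or algorithms.

```python
# Hypothetical sketch of an adaptive multilevel training loop (PyTorch).
# The refinement criterion below is a crude placeholder for the
# a-posteriori estimates and control strategies developed in the project.
import torch
import torch.nn as nn


def widen_network(model: nn.Sequential, new_width: int) -> nn.Sequential:
    """Prolongate a (Linear, Tanh, Linear) network to a wider hidden layer.
    Old weights are copied into the leading block; the new output columns are
    zeroed so the widened network reproduces the coarse-level function."""
    old_fc1, old_fc2 = model[0], model[2]
    old_width = old_fc1.out_features
    fc1 = nn.Linear(old_fc1.in_features, new_width)
    fc2 = nn.Linear(new_width, old_fc2.out_features)
    with torch.no_grad():
        fc1.weight[:old_width, :] = old_fc1.weight
        fc1.bias[:old_width] = old_fc1.bias
        fc2.weight.zero_()
        fc2.weight[:, :old_width] = old_fc2.weight
        fc2.bias.copy_(old_fc2.bias)
    return nn.Sequential(fc1, nn.Tanh(), fc2)


def train_adaptively(x, y, width=4, max_levels=5, epochs_per_level=200, tol=1e-4):
    """Train on a coarse (narrow) network first and widen it only when the
    current level does not yet reach the target accuracy."""
    model = nn.Sequential(nn.Linear(x.shape[1], width), nn.Tanh(),
                          nn.Linear(width, y.shape[1]))
    loss_fn = nn.MSELoss()
    for level in range(max_levels):
        opt = torch.optim.Adam(model.parameters(), lr=1e-2)
        loss = loss_fn(model(x), y)
        for _ in range(epochs_per_level):
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        # Placeholder refinement indicator: refine only if the target accuracy
        # has not been reached and further levels are still allowed.
        if loss.item() < tol or level == max_levels - 1:
            break
        width *= 2                       # move to the next (wider) level
        model = widen_network(model, width)
    return model
```

For example, calling train_adaptively(x, y) with float tensors x of shape (N, d) and y of shape (N, m) only pays for larger networks when the coarse level no longer suffices; in the project, the simple target-accuracy test would be replaced by the a-posteriori estimates and algorithmic control strategies described above.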
DFG Programme Priority Programmes
Subproject of SPP 2298: Theoretical Foundations of Deep Learning