Project Details
Stability and Solvability in Deep Learning
Applicant
Professor Dr. Felix Voigtlaender
Subject Area
Mathematics
Term
since 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 448518204
One of the challenges of the current digital era is automated decision making, which is often based on classification. For example, in image classification the task of the computer is to identify which objects are shown in a given image. This task is too complicated to solve by explicitly writing a program. Therefore, one uses artificial intelligence approaches that learn to solve the given task based on a set of training samples. In the last decade, the cutting-edge technique for this in fields like image classification and game intelligence has been Deep Learning, the heuristic adaptation of the weights of large neural networks based on training samples. These weights determine the strength of the connections between the different neurons of the network. The standard procedure for training the network weights is to initialize them randomly and then iteratively adapt them by minimizing (via the method of steepest descent, more formally called gradient descent) the error made by the network on the given training samples.
Empirically, this approach outperforms all classical classification methods. Formally, though, the mathematical understanding of Deep Learning is far from complete. An example of a not fully understood phenomenon with significant practical impact is the instability of neural networks with respect to very small changes in the inputs. For instance, a network might correctly classify an image of a cat as depicting a cat, but when a small perturbation (imperceptible to a human) is applied to the picture, the same network will classify it as depicting a dog.
The goal of this project is to investigate and formalize this phenomenon, focusing on the following three points:
1. Mathematically prove that the current state-of-the-art training methods necessarily induce this instability as a side effect, and mathematically study the most widely used algorithms for generating perturbations that lead to misclassification.
2. Examine partially successful existing ideas to mitigate this instability. Propose a new training dogma with stability guarantees that do not affect the classification accuracy.
3. Analyze the computability of this and other training methods. In full generality, the problem of minimizing the error on the training sample is not computable; however, empirical evidence suggests that neural networks can be trained successfully using computers. This warrants a more precise investigation of the conditions that guarantee computability.
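The training procedure and the instability described above can be illustrated with a minimal sketch. It assumes a toy setting that is not taken from the project: a linear classifier with logistic loss instead of a deep network, two hand-made 100-pixel "images" that share a common background and differ by only 0.02 per pixel in the remaining half, and a sign-of-gradient perturbation as one simple stand-in for the perturbation algorithms mentioned in point 1. All names (train, perturb, predict) and all constants are hypothetical and chosen for illustration only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, x):
    # Linear model: positive score means class +1, otherwise class -1.
    return sum(wi * xi for wi, xi in zip(w, x))


def predict(w, x):
    return 1 if score(w, x) > 0 else -1


def train(samples, dim, steps=2000, lr=0.1, seed=0):
    # Random initialization followed by plain gradient descent on the
    # logistic loss log(1 + exp(-y * w.x)), summed over the training samples.
    rng = random.Random(seed)
    w = [rng.uniform(-0.01, 0.01) for _ in range(dim)]
    for _ in range(steps):
        grad = [0.0] * dim
        for x, y in samples:
            # d/dw log(1 + exp(-y * w.x)) = -y * sigmoid(-y * w.x) * x
            coeff = -y * sigmoid(-y * score(w, x))
            for i in range(dim):
                grad[i] += coeff * x[i]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def sign(v):
    return (v > 0) - (v < 0)


def perturb(w, x, y, eps):
    # The loss gradient with respect to the input x is a negative multiple
    # of y * w, so stepping by eps along its sign moves every pixel by at
    # most eps in the direction that increases the loss.
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]


# Toy data (assumption): 50 shared background pixels, 50 pixels carrying
# a small class-dependent offset of +0.02 or -0.02.
DIM = 100
background = [0.4] * 50
x_pos = background + [0.02] * 50
x_neg = background + [-0.02] * 50
samples = [(x_pos, 1), (x_neg, -1)]

w = train(samples, DIM)
x_adv = perturb(w, x_pos, 1, eps=0.03)

print("clean predictions:", predict(w, x_pos), predict(w, x_neg))
print("prediction after perturbation:", predict(w, x_adv))
print("largest pixel change:", max(abs(a - b) for a, b in zip(x_adv, x_pos)))
```

In this toy setting both training samples are classified correctly, yet the perturbed input is assigned to the other class although no pixel moves by more than 0.03. For a linear model the perturbation shifts the score by eps times the sum of the absolute weights, a quantity that grows with the number of pixels. The sketch only illustrates the mechanism; it does not show that instability is a necessary side effect of training, which is the question the project addresses.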
DFG Programme
Independent Junior Research Groups
International Connection
Austria, Netherlands, United Kingdom
Cooperation Partners
Professor Dr. Sjoerd Dirksen; Professor Anders Christian Hansen, Ph.D.; Professor Dr. Philipp Petersen