Project Details

Analyzing convolutional neural networks with multiscale statistics

Subject Area Mathematics
Term from 2020 to 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 438446671
 
Final Report Year 2021

Final Report Abstract

In this project I studied convolutional neural networks (CNNs) from a mathematical perspective. They are an important building block in some highly successful deep learning methods, but a rigorous analysis of their strengths and weaknesses is not yet available. I analyzed U-nets, a special type of CNN, and applied them to inverse problems with an unknown operator. In these problems one has access to indirect information about an object, such as resonance signals in magnetic resonance imaging (MRI), and the task is to recover the object (an image of an organ or tissue) from the signal. In some applications the connection between object and signal, described by an operator that maps the former to the latter, is not exactly known, for instance when the model is complex or numerically expensive, or when model parameters are unknown. In that situation, recovering the object becomes a hard problem.

In my work I showed that, given many examples of objects and their corresponding signals, a U-net is able to "learn" how to reconstruct the object even if the operator is unknown. I proved an explicit formula for the accuracy of the reconstruction, which improves with the number of examples and with the image quality. Moreover, my work gives explicit recommendations on how to choose the many hyperparameters of U-nets in order to achieve optimal performance, which is a difficult question in practice. To my knowledge, this is the first time that such a result has been proven for neural networks in the context of statistical inverse problems.

Beyond these particular results, my work also presents a methodology for analyzing deep neural networks in other contexts and applications, such as image segmentation and classification. In a nutshell, the main ideas are to treat the networks as operators, to relate them to well-studied objects such as the wavelet transform or the wavelet-vaguelette decomposition, and to control the covering numbers of the class of networks in an appropriate way. In addition to these theoretical results, I have also begun to apply them to predictive maintenance in railway systems in collaboration with the German Aerospace Center (DLR, Braunschweig), and I plan to apply them to magnetic resonance imaging in collaboration with researchers at the University Health Network, Toronto.
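
As an illustration of the setting described above, a minimal formalization of a statistical inverse problem with supervised training data might read as follows; the notation (forward operator A, noise level \varepsilon, sample size n, network class \mathcal{F}) is introduced here for illustration and is not taken from the report:

    Y_i = A X_i + \varepsilon \, \xi_i, \qquad i = 1, \dots, n,

where the X_i are the unknown objects (e.g. images), A is the operator mapping objects to signals (possibly unknown), the \xi_i are noise terms, and \varepsilon > 0 quantifies the image quality. Given the training pairs (X_i, Y_i), a reconstruction network such as a U-net could, for instance, be selected by empirical risk minimization over a class \mathcal{F} of networks,

    \hat{f}_n \in \arg\min_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^{n} \| f(Y_i) - X_i \|^2,

and accuracy guarantees of the kind mentioned above bound the reconstruction error of \hat{f}_n in terms of n and \varepsilon, with the complexity of \mathcal{F} controlled via its covering numbers. This is only a sketch of one standard formulation, not the precise setup or result of the project.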

