Project Details
Neural network training via persistence of excitation
Applicant
Professor Dr.-Ing. Matthias Müller
Subject Area
Automation, Mechatronics, Control Systems, Intelligent Technical Systems, Robotics
Term
since 2025
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 535860958
As part of the research unit "Active Learning for Systems and Control (ALeSCo) - Data Informativity, Uncertainty, and Guarantees", this project develops and analyzes novel methods for the supervised training of neural networks (NNs) using concepts from systems and control theory. The main objective is to use these control-theoretic methods to establish guaranteed generalization properties for the trained neural network, in the sense that the difference between the outputs of the trained NN and those of an NN with ideal weights is contained within specific error bounds. In particular, we leverage recent advances in the stability analysis of moving horizon estimation (MHE) as well as in data-based system representations using nonlinear variants of Willems' fundamental lemma (FL). In both cases, informativity conditions on the training data must be satisfied, and active learning strategies will be developed to determine which data points to collect. The first approach pursued within this project employs MHE, a state estimation technique suitable for general nonlinear systems. Its use for NN training is enabled by reformulating the weight update equation and the NN output function as a dynamical system: estimating the states of this system amounts to determining a set of neural weights that appropriately fits the measured outputs. We will investigate suitable modifications of existing MHE schemes and of recently developed robust stability proof techniques to make them applicable to NN training, which will yield the desired NN generalization guarantees described above. The FL, in turn, provides a system representation that allows every possible input-output trajectory of a dynamical system to be obtained from measured data alone, without the need to identify the system's model parameters.
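The MHE-based training idea can be illustrated with a minimal sketch. Everything here is an assumed toy setup, not the project's actual formulation: a small network whose output is linear in its trainable weights (a frozen tanh feature layer), so that each window of the moving horizon problem is a well-posed least-squares fit. Constant weights give the dynamical system w_{k+1} = w_k with output y_k = h(w_k, u_k), and MHE estimates the state w from a moving window of input-output data with a simple quadratic arrival cost:

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical toy network: y = W @ tanh(u), with W the trainable weights.
# Treating the weights as the state of w_{k+1} = w_k, y_k = h(w_k, u_k),
# MHE recovers them from windows of measured input/output data.
rng = np.random.default_rng(0)
n_in, n_out = 3, 2
W_true = rng.standard_normal((n_out, n_in))  # "ideal" weights to recover

def nn_output(w_flat, u):
    return w_flat.reshape(n_out, n_in) @ np.tanh(u)

# Measured training data (the inputs must be sufficiently informative)
N, horizon = 60, 20
U = rng.standard_normal((N, n_in))
Y = np.array([nn_output(W_true.ravel(), u) for u in U])

w_est = np.zeros(n_out * n_in)  # prior estimate, anchors the arrival cost
for k in range(horizon, N + 1, horizon):
    Uw, Yw = U[k - horizon:k], Y[k - horizon:k]

    def residuals(w):
        # fitting error over the window, plus a simple quadratic
        # arrival cost pulling toward the previous estimate
        fit = np.concatenate([nn_output(w, u) - y for u, y in zip(Uw, Yw)])
        prior = 0.1 * (w - w_est)
        return np.concatenate([fit, prior])

    w_est = least_squares(residuals, w_est).x

err = np.max(np.abs(w_est - W_true.ravel()))
```

With noise-free data and informative inputs, the windowed estimates converge to the ideal weights; the project's actual schemes concern genuinely nonlinear-in-the-weights networks, where the robust stability analysis of MHE becomes essential.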
Since the input-output mapping of an NN can itself be expressed as a dynamical system, similar ideas can be used to represent it using the training data, without explicitly assigning values to the neural weights. To achieve this data-based representation, the second approach pursued within this project develops novel nonlinear extensions of the FL that exploit the known structure of the NN to be trained. The resulting training methods will make it possible to establish generalization properties as long as the training data satisfies certain informativity conditions. In this project, we will analyze notions of persistence of excitation and other informativity conditions for the different training approaches and different network architectures. Moreover, active learning strategies will be developed to satisfy such informativity conditions efficiently and effectively. To improve the practical applicability of the proposed training methods, strategies to reduce their computational complexity will also be investigated. Finally, we will study the application of the developed methods to different benchmark systems.
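The informativity condition underlying the classical (linear) fundamental lemma can be sketched concretely; the nonlinear extensions developed in this project generalize this idea. For a scalar input sequence, persistence of excitation of order L is the requirement that the depth-L Hankel matrix of the data has full row rank:

```python
import numpy as np

# Sketch of the classical persistence-of-excitation check (scalar input):
# u is persistently exciting of order L iff its depth-L Hankel matrix
# has full row rank L.
def hankel(u, L):
    u = np.asarray(u, dtype=float)
    return np.column_stack([u[i:i + L] for i in range(len(u) - L + 1)])

def is_persistently_exciting(u, L):
    return np.linalg.matrix_rank(hankel(u, L)) == L

rng = np.random.default_rng(1)
u_rich = rng.standard_normal(30)  # random input: informative
u_poor = np.ones(30)              # constant input: Hankel matrix has rank 1

rich_ok = is_persistently_exciting(u_rich, 5)  # True
poor_ok = is_persistently_exciting(u_poor, 5)  # False
```

Under this condition, the columns of a suitable Hankel matrix of measured input-output data span all system trajectories of a given length; active learning then amounts to choosing inputs so that such rank conditions are met with as little data as possible.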
DFG Programme
Research Units
Subproject of
FOR 5785:
Active Learning for Systems and Control (ALeSCo) - Data Informativity, Uncertainty, and Guarantees
Co-Investigator
Dr. Victor Gabriel Lopez Mejia
