Project Details
Recursive and sparse approximation in reinforcement learning with applications
Applicants
Privatdozent Dr. Christian Bayer; Professor Dr. Denis Belomestny; Professor Dr. Vladimir Spokoiny
Subject Area
Mathematics
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 497300407
Reinforcement learning (RL) is an integral part of machine learning concerned with controlling a system so as to maximize a performance measure that expresses a long-term objective. RL attracts great interest because of its many practical applications, ranging from problems in artificial intelligence to operations research and financial mathematics. With recent breakthroughs in deep learning, deep reinforcement learning (DRL) has demonstrated notable success on highly challenging problems. DRL algorithms are compelling, but it remains an open question how to relate the architecture of the networks involved to the structure of the underlying dynamic programming algorithm. Moreover, in DRL the approximate dynamic programming algorithm involves solving a highly nonconvex statistical optimization problem. As an alternative to conventional deep neural network approximations in each backward step, one can construct a more problem-oriented nonlinear approximation using information from the previous stages of the approximate dynamic programming algorithm.
In this project, we aim to develop new types of recursive, sparse, and interpretable approximation methods for RL with provable theoretical guarantees. In particular, our objective is to design a new fitted Q-iteration algorithm whose approximation classes are chosen adaptively, depending on the previously constructed approximations. We shall compare this new approach to more conventional DRL algorithms in terms of their theoretical guarantees and interpretability. Furthermore, we will extend our methods to mean-field systems by combining our expertise in RL and McKean-Vlasov type processes. As a practical application, we will interpret the problem of consistent re-calibration of financial models as an RL problem and study it with the methods developed in this project.
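To fix ideas, the following is a minimal sketch of the classical batch fitted Q-iteration scheme that the proposed algorithm builds on, written in Python with a fixed scikit-learn regressor as the approximation class. It is purely illustrative: the function and variable names (fitted_q_iteration, n_actions, transitions) are hypothetical, and the project's key ingredient, the adaptive, data-driven choice of the approximation class in each backward step, is not implemented here.

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def fitted_q_iteration(transitions, n_actions, gamma=0.99, n_iters=50):
    """Classical fitted Q-iteration on a batch of (s, a, r, s_next) tuples.

    transitions: list of (s, a, r, s_next), with s and s_next 1-D numpy
                 arrays and a an integer action index.
    Returns the final fitted regressor approximating Q(s, a).
    """
    states = np.array([s for s, _, _, _ in transitions])
    actions = np.array([a for _, a, _, _ in transitions])
    rewards = np.array([r for _, _, r, _ in transitions])
    next_states = np.array([sn for _, _, _, sn in transitions])

    # Features: state concatenated with a one-hot encoding of the action.
    def features(s_batch, a_batch):
        one_hot = np.eye(n_actions)[a_batch]
        return np.hstack([s_batch, one_hot])

    X = features(states, actions)
    q = None
    for _ in range(n_iters):
        if q is None:
            # First iteration: regress on one-step rewards only.
            y = rewards
        else:
            # Bellman targets: r + gamma * max_a' Q_k(s', a').
            q_next = np.column_stack([
                q.predict(features(next_states,
                                   np.full(len(next_states), a)))
                for a in range(n_actions)
            ])
            y = rewards + gamma * q_next.max(axis=1)
        # Each backward step is an ordinary regression problem over a fixed
        # function class; the project instead proposes choosing this class
        # adaptively from the previously constructed approximations.
        q = ExtraTreesRegressor(n_estimators=50).fit(X, y)
    return q

In this conventional scheme the regression class is the same in every iteration; replacing it with a recursively and sparsely constructed class, with provable guarantees, is precisely the open step the project addresses.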
DFG Programme
Research Grants