Project Details
Deep learning-based collective variable identification and surrogate modeling of stochastic dynamics in biophysics
Applicant
Wei Zhang, Ph.D.
Subject Area
Mathematics
Term
since 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 524086759
Understanding the essential dynamics of molecular systems on long time scales is one of the most important goals of many research studies in biophysics. Among the numerous existing methodological approaches, collective variable (CV) identification and surrogate modeling have proven very promising for achieving this goal. Nevertheless, despite the significant progress that has been made, fundamental challenges remain in both approaches and hinder their use in concrete applications. On the one hand, although various CV identification methods are available, rational approaches that are guaranteed to find suitable CVs for the study of molecular kinetics are still unavailable, and a systematic way of determining the proper number of CVs is yet to be developed. In view of recent developments in autoencoder-based CV identification methods, the theoretical understanding of how to incorporate dynamical information into the training of autoencoders also needs to be improved. On the other hand, while deep-learned CVs are used in building Markov state models (MSMs), linear (Koopman) models, and various machine learning-based models, they have not been applied to building surrogate models described by stochastic differential equations (SDEs), despite the advantages of SDE models. In fact, surrogate SDE models allow trajectory ensembles to be sampled at flexible time resolutions, in contrast to discrete-time models such as MSMs and Koopman models. Compared to purely machine learning-based models, they retain a physical interpretation (e.g. by having a gradient structure in the form of a free energy), and they are convenient for model error analysis. This project will develop deep learning-based methods for CV identification and for building surrogate SDE models of molecular systems, with the aim of bridging the above-mentioned research gaps between mathematical understanding, numerical algorithms, and applications in biophysics.
The research will contribute to both theoretical and algorithmic developments in deep learning for molecular dynamics. The new methods will be applied to the study of protein-folding processes in biophysics.
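
To make the notion of a surrogate SDE model with gradient structure more concrete, the following is a minimal sketch and not the project's method. It assumes a one-dimensional CV, a hypothetical double-well free energy F(z) = (z^2 - 1)^2, an illustrative inverse temperature beta, and a plain Euler-Maruyama discretization of dZ = -F'(Z) dt + sqrt(2/beta) dW. The names free_energy_grad and simulate_sde are invented for this illustration; in practice the free energy would be learned from data.

```python
import numpy as np


def free_energy_grad(z):
    # Gradient of a hypothetical double-well free energy F(z) = (z**2 - 1)**2.
    # This form is an assumption made only for the illustration.
    return 4.0 * z * (z**2 - 1.0)


def simulate_sde(z0, dt, n_steps, beta=3.0, seed=0):
    """Euler-Maruyama scheme for dZ = -F'(Z) dt + sqrt(2/beta) dW.

    z0 holds the initial CV values of an ensemble (one entry per trajectory).
    Returns an array of shape (n_steps + 1, len(z0)).
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z0, dtype=float).copy()
    traj = np.empty((n_steps + 1,) + z.shape)
    traj[0] = z
    noise_scale = np.sqrt(2.0 * dt / beta)
    for k in range(n_steps):
        # Drift along the negative free-energy gradient plus Gaussian noise.
        z = z - free_energy_grad(z) * dt + noise_scale * rng.standard_normal(z.shape)
        traj[k + 1] = z
    return traj


# Ensemble of 100 trajectories started in the left well, sampled from the
# same continuous-time model at two different time resolutions.
z0 = np.full(100, -1.0)
coarse = simulate_sde(z0, dt=1e-2, n_steps=500)
fine = simulate_sde(z0, dt=1e-3, n_steps=5000)
```

Both runs cover the same physical time span (5 time units) with the same model; only the step size differs. A discrete-time model such as an MSM, in contrast, is tied to the lag time at which it was estimated.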
DFG Programme
Research Grants