Project Details

Elucidating Fingerprints – Towards a Holistic Explanatory Toolbox for Molecular Machine Learning

Subject Area Organic Molecular Chemistry - Synthesis and Characterisation
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term since 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 497089464
 
The central aim of this proposal is the development of out-of-the-box tools for interpretable and explainable molecular machine learning (MML) on a structural level. Within this project, widely used molecular representations will be developed, adapted and used to train highly robust yet accurate models (e.g. gradient boosting algorithms). Starting from these models, an open-source software pipeline will be employed to map feature importance, influence and interdependencies, as well as model confidences, back to the molecular structure, giving trained chemists a straightforward handle for molecular and reaction design. An important part of this work will be the development of visualizations based on analytic results that are highly accurate on the one hand and easy to understand for any scientist working in the field of molecular science on the other. These tools shall be usable both to investigate and improve the underlying datasets and for molecular design. In addition to the coloration and visualization of individual molecules, methods for the statistical evaluation of the general influence of functional groups will be developed, so that rules for further reaction design can be derived. Finally, these rules will be applied in the laboratory to validate the explanatory methods developed in the course of this project. With these objectives, the proposal aims to fulfil the following general goals of the Priority Programme (PP): “Application of state-of-the-art ML algorithms – Explainable AI”, “Development of (domain specific) molecular representations – Generally improved molecular representations” and “Prediction, understanding and interpretation of molecular properties – Improvement of current applications”. Within this scope, a strong focus lies on the interpretation and explanation of models for quantitative yield prediction, in order to find handles for systematic improvement within this underdeveloped area of MML, which has also been defined as a major topic of this PP.
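As an illustration of what mapping feature importance back to the molecular structure could look like, the following is a minimal sketch and not the project's actual pipeline. It assumes that each fingerprint bit is already linked to the atoms that set it (a cheminformatics toolkit can typically provide such a mapping) and that a signed importance per bit is available from a trained model. The function name `atom_weights`, the bit indices and the importance values are all hypothetical.

```python
def atom_weights(bit_to_atoms, bit_importance, n_atoms):
    """Spread each fingerprint bit's importance evenly over its atoms.

    bit_to_atoms:   dict mapping bit index -> list of atom indices (assumed input)
    bit_importance: dict mapping bit index -> signed importance (assumed input)
    n_atoms:        number of atoms in the molecule
    """
    weights = [0.0] * n_atoms
    for bit, atoms in bit_to_atoms.items():
        if not atoms:
            continue  # a bit without atoms cannot be mapped to the structure
        share = bit_importance.get(bit, 0.0) / len(atoms)
        for atom in atoms:
            weights[atom] += share
    return weights


# Hypothetical toy molecule with four atoms and three fingerprint bits.
bit_to_atoms = {12: [0, 1], 87: [1, 2, 3], 301: [3]}
bit_importance = {12: 0.4, 87: -0.3, 301: 0.2}

weights = atom_weights(bit_to_atoms, bit_importance, n_atoms=4)
for atom, weight in enumerate(weights):
    print(f"atom {atom}: {weight:+.3f}")
```

The resulting per-atom weights could then drive the coloration of a molecule drawing. Splitting each bit's importance evenly among its atoms is only one possible design choice; it keeps the sum of the atom weights equal to the sum of the bit importances.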
DFG Programme Priority Programmes
 
 
