Project Details

Integrating clinical and molecular patient data into subgroup risk prediction models for enabling individualized therapy

Subject Area Epidemiology and Medical Biometry/Statistics
Term from 2013 to 2017
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 243584364
 
Molecular measurements from patients, e.g. gene expression, promise to improve risk prediction for endpoints such as relapse or death in cancer. Furthermore, such high-dimensional measurements have been useful for identifying molecular subgroups by clustering techniques. Risk prediction models within these subgroups can provide a basis for individualized therapy, as prognosis is more specific to the individual patient. While some researchers have manually built such subgroup risk prediction models, an objective, automated approach is still missing.
This project will develop such an approach, addressing several hurdles. First, clustering techniques have not been tuned for the performance of risk prediction models in the identified subgroups, i.e. clustering needs to be coupled to the prediction performance of these models. Second, as the number of individuals in each subgroup will often be small, a subgroup model fitting approach is desirable that can borrow information from other subgroups. Third, clustering techniques as well as model fitting approaches will need to integrate patient characteristics alongside molecular measurements, as the former will often comprise established, pre-selected predictors. While some techniques are available for making this distinction in model fitting, clustering approaches so far mostly focus on molecular measurements. Similarly, automated selection of a small list of molecular quantities important for clustering has only recently been addressed by subspace techniques. Combining such techniques with model fitting might even make it possible to identify interactions, when linking a subgroup that is, e.g., characterized by a small set of genes to a risk prediction model that also uses only a small number of genes.
As a core technique, the project will extend localized regression techniques, which can still give some weight to individuals from other subgroups when fitting a risk prediction model for a specific subgroup. These will first be combined with established clustering techniques, before more advanced subspace clustering approaches are adapted. Stability of identified molecular signatures will be an important optimization criterion besides prediction performance. Clustering approaches will be adapted to adequately integrate clinical and molecular measurements. Furthermore, gene group knowledge, e.g., corresponding to pathway membership, will be incorporated. The project will also extend clustering approaches for extracting information complementary to risk prediction models and for directly using signature similarity as a clustering criterion. Furthermore, logic regression will be adapted for obtaining subgroup and risk group descriptions in a post-processing step, which will be compared to approaches based on direct search for interactions. To optimize the approaches for real data, the project will specifically consider gene expression measurements from breast cancer patients and SNPs from an epidemiological cohort.
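The idea of borrowing information across subgroups, described above as localized regression, can be illustrated with a minimal sketch. It is not the project's method: it swaps in a plain weighted least-squares line with a single predictor in place of a risk prediction model, and the function name `fit_localized`, the parameter `other_weight`, and the toy data are all hypothetical. Individuals in the target subgroup get weight 1, and all others get a smaller weight.

```python
def fit_localized(x, y, groups, target, other_weight=0.3):
    """Weighted least-squares line y = a + b*x for one subgroup.

    Illustrative assumption: members of `target` get weight 1.0 and
    everyone else gets `other_weight` (0 = subgroup only, 1 = pooled fit).
    """
    w = [1.0 if g == target else other_weight for g in groups]
    sw = sum(w)
    # Weighted means of predictor and outcome.
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted sums of squares and cross-products around the means.
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Toy data (made up): subgroup "A" follows y = x, subgroup "B" follows y = 2x.
x = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
y = [0.0, 1.0, 2.0, 0.0, 2.0, 4.0]
groups = ["A", "A", "A", "B", "B", "B"]

for weight in (0.0, 0.3, 1.0):
    intercept, slope = fit_localized(x, y, groups, "A", other_weight=weight)
    print(f"other_weight={weight}: intercept={intercept:.3f}, slope={slope:.3f}")
```

With `other_weight` set to 0 the fit uses only the target subgroup, with 1 it is the pooled fit, and values in between pull the subgroup estimate toward the pooled one, which is the trade-off that matters when subgroups are small.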
DFG Programme Research Grants
 
 
