Project Details
GAMLSS for biostatistical regression modeling. Refinements and Further Developments
Applicant
Professor Dr. Matthias Schmid
Subject Area
Epidemiology and Medical Biometry/Statistics
Term
from 2012 to 2020
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 217090301
In view of the rapid development of personalized therapies and diagnostic tools, high-dimensional data analysis has gained considerable importance in biomedical research. A challenge for biostatistical method development is therefore to extract relevant information from various (possibly high-dimensional) data sources and to combine this information into accurate statistical prediction rules. To this end, the first period of the DFG project was concerned with the development of AUC-optimized prediction rules for binary and censored clinical outcome variables.
Because clinical and anamnestic information is collected in ever greater detail in biomedical studies, and because technologies for generating molecular data are developing rapidly, an increasingly important task for biostatisticians is to develop new methods that incorporate prior biological or clinical knowledge into statistical prediction rules. Consequently, there is a need to combine the data-driven methods developed in the first project period with the optimization of regression models that can be adapted to existing knowledge and information.
A highly promising tool to address these issues is the GAMLSS methodology, which allows for the flexible modeling of a large variety of continuous and categorical outcome variables. For this reason, GAMLSS methods are the main subject of the work program of the proposed second project period. In particular, they allow for the specification of flexible mean and variance structures according to prior knowledge. In addition, variable selection in GAMLSS is possible via the techniques developed in the first project period.
Despite the many advantages of GAMLSS, several limitations currently tend to prevent its broad use in biomedical research. These limitations concern the construction of valid hypothesis tests in high-dimensional settings as well as the failure to account for uncertainty when estimating prediction intervals via GAMLSS. Moreover, there are no extensions to multi-dimensional outcome variables. The main goal of the proposed project is therefore to analyze these limitations of GAMLSS and to develop methods that solve the resulting problems.
The evaluation of the newly developed methods will, in particular, comprise the analysis of clinical and epidemiological study data. This analysis will be based on current research and collaboration projects of the applicants. In addition, the newly developed methods will be implemented and made available to users via open source software.
DFG Programme
Research Grants
Co-Investigator
Professor Dr. Olaf Gefeller