Project Details

Methods and algorithms for haplotype-based linkage and association analysis of rare genetic variants

Subject Area Epidemiology and Medical Biometry/Statistics
Term from 2010 to 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 179606676
 
Final Report Year 2022

Final Report Abstract

The aim of this DFG-funded project was to develop and implement a new method for the joint analysis of linkage and association that can handle arbitrary pedigree types as well as unrelated individuals. In earlier work, we developed and continuously improved the GENEHUNTER-MODSCORE (GHM) software package, the first implementation of MOD score analysis, in which the parametric LOD score is maximized with respect to the parameters of the trait model, i.e., the penetrances and the disease-allele frequency. A MOD score analysis is especially useful if the trait-model parameters cannot be specified prior to the analysis, which is often the case for complex diseases. Within this project, we finalized three studies concerning methodological advances in GHM and evaluations of the properties of the MOD score approach.

Firstly, we developed and implemented a new algorithm for calculating the disease-locus likelihood, which is often the most time-consuming step of a MOD score analysis. This led to a significant speed-up of GHM MOD score calculations, especially in extensive simulation studies.

Secondly, we reviewed the problem of parameter estimation in pedigree analysis in general and specifically evaluated the ability of the MOD score approach to estimate trait-model parameters. Under certain conditions, the trait-model parameters can theoretically be estimated without bias in a MOD score linkage analysis. In our results, bias and variability of the trait-model parameter estimates generally decreased with increasing pedigree complexity, especially for recessive and overdominant models. However, dominant and additive models could hardly be distinguished from each other, even when using 3-generation pedigrees. We also investigated the ability of the MOD score to detect genomic imprinting using affected sib pairs (ASPs) and affected half-sib pairs (AHSPs) with missing parental genotype data. Genomic imprinting refers to the dependence of an individual's liability to develop a disease on the parental origin of the mutated allele(s). Imprinting could clearly be detected for mixtures of mainly ASPs and only a few AHSPs whose common parent was of the imprinted sex, even though no parental genotypes were available.

Thirdly, we proposed and evaluated a novel test statistic and quantification method for imprinting that is based on the MOD score (MOD-score based imprinting test, MOBIT). The MOBIT statistic is calculated as the difference between the MOD score accounting for imprinting ('IMOD score') and the MOD score not accounting for imprinting ('MOD score'). A major problem in linkage-based imprinting testing is the confounding between imprinting and sex-specific recombination fractions. We thoroughly investigated the statistical properties of the MOBIT and the effect of this confounding using extensive simulations. We also proposed and investigated a simulation procedure to obtain an empirical p value for the MOBIT and implemented it in a new version of GHM.

Finally, we developed and implemented a new method for the joint analysis of linkage and association in GHM. Joint linkage and association (JLA) analysis combines two disease gene mapping strategies: linkage information gathered from families and association information gathered from populations. JLA analysis can increase mapping power, especially when the evidence for both linkage and association is low to moderate. Similarly, an association analysis based on haplotypes instead of single markers can increase mapping power when the association pattern is complex. Our new JLA method is an extension of the MOD score approach that jointly estimates trait-model and association parameters. Association is modelled using marker-trait locus haplotypes of a single diallelic trait locus and up to three single nucleotide variants. Linkage information is extracted from additional, possibly multi-allelic, flanking markers. Model parameters are optimized with the derivative-free optimization algorithm COBYLA. We investigated the statistical properties of our JLA implementation using extensive simulations and compared our approach to the single-marker JLA test implemented in the popular PSEUDOMARKER software package. Because the null distribution of our JLA test is unknown, we implemented and evaluated a simulation routine that readily allows the simulations to be run in parallel. We demonstrated the validity of our JLA analysis implementation and identified scenarios with complex association patterns for which haplotype-based tests outperformed the single-marker tests. Our new JLA-MOD score method has the potential to be a valuable gene mapping and characterization tool. It is particularly useful in situations where either linkage or association information alone provides insufficient power to identify disease-causing genetic variants.
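The abstract describes two simulation-based tests: the MOBIT, computed as the IMOD score minus the MOD score, and the JLA test, whose null distribution is unknown. The Python sketch below only illustrates the general idea of turning simulated null replicates into an empirical p value; it is not the GHM implementation. The function names, the example scores, and the add-one correction in the p value are assumptions made for illustration and do not come from the report.

```python
from typing import Sequence


def mobit(imod_score: float, mod_score: float) -> float:
    """MOBIT statistic: MOD score with imprinting minus MOD score without."""
    return imod_score - mod_score


def empirical_p_value(observed: float, null_stats: Sequence[float]) -> float:
    """Share of null replicates at least as large as the observed statistic.

    The add-one correction (an assumption of this sketch) counts the observed
    value as one more replicate, so the estimate can never be exactly zero.
    """
    if len(null_stats) == 0:
        raise ValueError("at least one simulated null replicate is required")
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


if __name__ == "__main__":
    # Hypothetical scores; in practice they would come from a linkage program.
    observed = mobit(imod_score=3.2, mod_score=2.1)
    # Hypothetical replicates of the statistic simulated under the null.
    null_replicates = [0.0, 0.1, 0.4, 1.5, 0.2, 0.0, 0.7, 0.3, 0.05]
    print(empirical_p_value(observed, null_replicates))
```

Because each replicate is generated independently, the replicates can be computed in parallel and pooled before the final count, which is consistent with the parallel simulation routine mentioned for the JLA test.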

Publications

  • Fast linkage analysis with MOD scores using algebraic calculation. Human Heredity 2014;78(3-4):179-94
    Markus Brugger and Konstantin Strauch
    (See online at https://doi.org/10.1159/000369065)
  • Estimation of trait-model parameters in a MOD score linkage analysis. Human Heredity 2016;82(3-4):103-39
    Markus Brugger, Susanne Rospleszcz, Konstantin Strauch
    (See online at https://doi.org/10.1159/000479738)
  • Properties and evaluation of the MOBIT - a novel linkage-based test statistic and quantification method for imprinting. Statistical Applications in Genetics and Molecular Biology 2019;18(4)
    Markus Brugger, Michael Knapp, Konstantin Strauch
    (See online at https://doi.org/10.1515/sagmb-2018-0025)
 
 
