Project Details

Accounting for spatial heterogeneity of parameters in the sequentially Markov coalescent process

Subject Area Mathematics
Term from 2015 to 2019
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 285412928
 
The sequentially Markov coalescent (SMC) is an approximation of the coalescent process with recombination that makes it applicable to whole-genome data sets. The SMC differs from the standard coalescent in that it models the genealogy of a set of sequences spatially along the alignment rather than chronologically. In addition, the process of genealogy change along the genome is Markovian, which allows hidden Markov models to be used for inferring population genomic parameters. Although the SMC models the coalescent in space, current models assume that parameters are homogeneous along the genome. This assumption is clearly at odds with our knowledge of genome biology, as mutation rate, recombination rate and effective population size are highly heterogeneous. SMC models have also been applied exclusively to higher eukaryotic species, mostly primates. These species have very large genomes, in which parameter heterogeneity is rather diluted. With next-generation sequencing data becoming increasingly affordable, population genomic data sets are being generated for species with smaller, more compact genomes. For these data sets, parameter heterogeneity can be much more extreme than for primate genomes. Such species include economically important fungal pathogens, which cannot be analyzed with current, overly simplistic models. In this project we propose an extension of current SMC models to account for stochastic processes along the genome. The spatial heterogeneity is modeled as a Markov process, which, when combined with the intrinsic Markov property of the SMC, results in a Markov-modulated sequentially Markov model. The project will establish the formal properties of such Markov-modulated SMC (MMSMC) models, both analytically and through simulation. Biological applications are proposed for both primate and fungal data sets.
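As a rough illustration of what combining two Markov processes along the genome can look like, the sketch below builds the transition matrix of a composite hidden state made of a parameter class and a genealogy. It is not the project's actual construction. The function name `modulated_transitions`, the two-class modulator `Q`, the per-class genealogy matrices `T`, and all numeric values are hypothetical, and the sketch assumes that the parameter class switches first and the genealogy then changes according to the new class.

```python
def modulated_transitions(Q, T):
    """Build the transition matrix of a composite (class, genealogy) chain.

    Q -- row-stochastic matrix for the parameter class (hypothetical modulator)
    T -- one row-stochastic genealogy transition matrix per class
    Assumption: moving from one site to the next, the class switches
    according to Q, then the genealogy changes according to the new class.
    State (c, g) is stored at index c * n_gen + g.
    """
    n_classes = len(Q)
    n_gen = len(T[0])
    size = n_classes * n_gen
    P = [[0.0] * size for _ in range(size)]
    for c in range(n_classes):
        for g in range(n_gen):
            for c2 in range(n_classes):
                for g2 in range(n_gen):
                    P[c * n_gen + g][c2 * n_gen + g2] = Q[c][c2] * T[c2][g][g2]
    return P


# Hypothetical values: two parameter classes (e.g. low and high
# recombination rate) and two discretized genealogies.
Q = [[0.99, 0.01],
     [0.02, 0.98]]
T = [
    [[0.9, 0.1], [0.1, 0.9]],  # genealogies change rarely in class 0
    [[0.6, 0.4], [0.4, 0.6]],  # genealogies change often in class 1
]
P = modulated_transitions(Q, T)
```

Each row of `P` sums to one, so the composite chain is itself a Markov chain and could, under these assumptions, serve as the hidden layer of a standard hidden Markov model.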
DFG Programme Priority Programmes
International Connection Denmark
 
 
