Project Details

From Genomes to Macroevolution: Development of phylogenetic methods for gene tree, species tree and diversification rate estimation

Subject Area Bioinformatics and Theoretical Biology; Evolution, Anthropology
Term since 2018
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 401011686
 
Species richness varies considerably among different groups of species. A major question in macroevolution is which processes and factors have shaped present-day biodiversity. Specifically, we are interested in whether species-diversification rates have changed through time and/or among lineages, and whether these changes are correlated with extrinsic (e.g. environmental) or intrinsic (e.g. phenotypic) factors. Ideally, we would have access to a complete fossil record documenting the exact biodiversity through time for every group. Unfortunately, the fossil record is far from complete, so alternative approaches are required. In this project I will develop mathematical and computational phylogenetic methods for species-diversification processes. These methods comprise estimating time-calibrated phylogenies from genomic data (work programmes 1 and 2) and inferring patterns of historical diversification (work programme 3).
Today we have easy access to many published full genome sequences and can use this rich information to estimate phylogenies. However, genomic data are inherently heterogeneous (e.g. gene histories and substitution rates differ among loci) and thus pose new challenges for phylogenetic analyses. My preliminary analyses show that gene-tree estimation is not robust when using state-of-the-art models. (1) In the first work programme (WP) I will focus on developing more complex and realistic substitution models, as well as model adequacy testing, for robust gene-tree estimation. (2) In the second WP I will develop new methods to estimate species trees from thousands of gene trees. Current methods either rely on crude approximations or can handle only very few loci for species-tree estimation. My approach will use importance sampling, which is equivalent to full-likelihood methods and can be implemented on large computer clusters. Hence, with my new methods we will be able to harness the full information contained in genomic data for phylogenetic inference.
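As a rough illustration of the general idea behind importance sampling (not the project's actual species-tree method), the sketch below estimates a marginal likelihood by averaging importance weights over draws from a proposal distribution. The toy model, the proposal, and all function names are assumptions made for this example; in a species-tree setting the integrated-out quantity would presumably be the gene trees rather than a single number.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def marginal_likelihood_is(x, n_samples=20000, seed=1):
    """Importance-sampling estimate of p(x), the integral of p(x | theta) p(theta).

    Toy model (assumed for illustration): theta ~ N(0, 1), x | theta ~ N(theta, 1).
    Proposal q(theta) = N(x / 2, 1), centred on the posterior mean.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        theta = rng.gauss(x / 2.0, 1.0)
        # Weight = likelihood * prior / proposal density, evaluated at the draw.
        weight = (
            normal_pdf(x, theta, 1.0)
            * normal_pdf(theta, 0.0, 1.0)
            / normal_pdf(theta, x / 2.0, 1.0)
        )
        total += weight
    return total / n_samples


def marginal_likelihood_exact(x):
    """Closed form for the toy model: marginally, x ~ N(0, 2)."""
    return normal_pdf(x, 0.0, math.sqrt(2.0))
```

Because each weight is computed independently of the others, the loop could in principle be split across many processors, which is consistent with the remark about large computer clusters.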
Once a species tree has been obtained, I will use mathematical models to infer diversification rates through time and among lineages. (3) This leads to my third WP, which focuses on deriving appropriate mathematical models for diversification-rate estimation. Specifically, I will derive and implement models for both lineage-specific and character-dependent diversification rates. Together with my previous work on species-diversification rates through time, these models will enable us to test hypotheses about which kind of factor, extrinsic or intrinsic, has the larger impact on diversification. In summary, my three proposed work programmes will use existing genomic data to estimate species trees while accounting for gene-tree discordance and heterogeneity of molecular evolution, and will then enable inference under complex species-diversification processes.
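To make the notion of diversification-rate estimation concrete, here is a minimal sketch under the simplest possible assumption: a pure-birth (Yule) process with one constant speciation rate and no extinction. This is far simpler than the lineage-specific and character-dependent models described above and is not taken from the project; the function names are hypothetical. Under this model the maximum-likelihood rate is the number of speciation events divided by the total branch length.

```python
import random


def simulate_yule_waiting_times(rate, n_tips, seed=1):
    """Simulate waiting times between speciation events under a pure-birth process.

    Starts with 2 lineages (the crown) and stops when the n_tips-th lineage arises.
    Returns a list of (number_of_lineages, waiting_time) pairs.
    """
    rng = random.Random(seed)
    intervals = []
    for k in range(2, n_tips):
        # With k lineages, the time to the next speciation is exponential
        # with rate k * rate.
        intervals.append((k, rng.expovariate(k * rate)))
    return intervals


def yule_rate_mle(intervals):
    """Maximum-likelihood speciation rate: events / total branch length."""
    n_events = len(intervals)
    total_branch_length = sum(k * t for k, t in intervals)
    return n_events / total_branch_length
```

For example, `yule_rate_mle(simulate_yule_waiting_times(0.5, 2000))` should return a value close to the simulated rate of 0.5, with the estimate tightening as the number of tips grows.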
DFG Programme Independent Junior Research Groups
 
 

