Project Details

Development of a computational approach to accurately detect gene losses in genome sequences

Subject Area: Bioinformatics and Theoretical Biology
Term: 2015 to 2020
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 281458164

Final Report Year: 2020

Final Report Abstract

A key question in genetics and evolutionary biology is: which DNA changes underlie the phenotypic differences between the remarkably diverse species on Earth? The overarching goal of this project was to investigate this question by focusing on one important class of genomic differences: the loss of ancestral genes. To systematically address the role of gene loss in phenotypic evolution, we developed a computational approach to accurately detect the inactivation (loss) of protein-coding genes. Key to achieving very high accuracy was the development of CESAR, a new Hidden Markov Model-based method that realigns coding exons while taking splice site and reading frame information into account. CESAR is not only key to detecting lost genes but also enables highly accurate and scalable annotation of conserved genes, which we demonstrated by projecting human genes to the genomes of 120 other mammals and 70 non-mammalian vertebrates. Our final gene loss detection pipeline achieves a specificity of 99.7%. We used this approach to generate comprehensive gene loss catalogs for more than 60 placental mammals. Analyzing these resources led to a number of novel insights into the link between genomic changes (gene loss) and phenotypic changes. For example, we discovered numerous lineage-specific as well as convergent gene losses that provide mechanistic explanations for well-known examples of mammalian adaptations, such as life in water, extreme diving abilities, or specializations to herbivorous, carnivorous or frugivorous diets. Even though one would intuitively expect gene loss to be typically maladaptive, our results suggest that gene loss as an evolutionary mechanism for adaptation is more widespread than previously thought. Our analyses also informed trait evolution by showing that ketogenesis (a metabolic process that converts fatty acids into fuel usable by the brain) is not essential for the evolution of large mammalian brains.
Furthermore, we showed that gene losses, as molecular vestiges, can resolve the controversially discussed ancestry of soft-tissue traits such as testicular descent in mammals, demonstrating that genomic analysis can also provide novel insights into evolutionary history. Finally, our gene loss catalogs highlight numerous mammals that are “natural knockouts” for genes implicated in human disease. In several cases, the deleterious disease phenotypes do not manifest in the respective species, which indicates that other genes or alternative mechanisms in these mammals may be able to substitute for the function of the disease-associated gene. Our results have appeared in 16 publications and were featured by news outlets including the New York Times (https://www.nytimes.com/2018/06/29/science/descending-testicles-evolution.html, https://www.nytimes.com/2019/09/26/science/whales-dolphins-genes-evolution.html).

Publications

  • Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation. Nucleic Acids Res 2016, 44:e103
    Sharma V, Elghafari A, Hiller M
    (See online at https://doi.org/10.1093/nar/gkw210)
  • A genomics approach reveals insights into the importance of gene losses for mammalian adaptations. Nat Commun 2018, 9:1215
    Sharma V, Hecker N, Roscito JG, Foerster L, Langer BE, Hiller M
    (See online at https://doi.org/10.1038/s41467-018-03667-1)
  • Loss of RXFP2 and INSL3 genes in Afrotheria shows that testicular descent is the ancestral condition in placental mammals. PLoS Biol 2018, 16:e2005293
    Sharma V, Lehmann T, Stuckas H, Funke L, Hiller M
    (See online at https://doi.org/10.1371/journal.pbio.2005293)
  • Recurrent loss of HMGCS2 shows that ketogenesis is not essential for the evolution of large mammalian brains. eLife 2018, 7:e38906
    Jebb D, Hiller M
    (See online at https://doi.org/10.7554/eLife.38906)
  • Losses of human disease-associated genes in placental mammals. NAR Genomics and Bioinformatics 2019, 2:lqz012
    Sharma V, Hiller M
    (See online at https://doi.org/10.1093/nargab/lqz012)
  • Convergent losses of TLR5 suggest altered extracellular flagellin detection in four mammalian lineages. Mol Biol Evol 2020
    Sharma V, Walther F, Hecker N, Stuckas H, Hiller M
    (See online at https://doi.org/10.1093/molbev/msaa058)
 
 
