Project Details

Extent, speed, and causes of changes in the protein-coding gene repertoire of holometabolous insects (Insecta: Endopterygota)

Subject Area Evolutionary Cell and Developmental Biology (Zoology)
Evolution, Anthropology
Ecology and Biodiversity of Animals and Ecosystems, Organismic Interactions
Term from 2016 to 2018
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 315489364
 
Final Report Year 2019

Final Report Abstract

We studied the evolution of protein-coding genes and transposable elements across insects, exploiting data from published and unpublished genomes, including those sequenced, assembled, and annotated by us. We first assessed to what extent automatically inferred gene models, which typically form the basis of comparative genomic analyses, are as reliable as manually curated gene models when studying gene structural parameters. For this purpose, we compared the structural properties of protein-coding genes in automatically annotated and manually curated gene sets from seven insect species sequenced by the i5k initiative, using the software COGNAT, which we developed for this purpose and made publicly available. We show that the properties of automatically generated gene models and their manually curated substitutes do not differ extensively, and that major trends regarding gene structure can be recovered congruently from both sets. Thus, large-scale comparisons of gene structure based on automatically generated annotations of protein-coding genes appear to be justifiable. Analyzing gene models predicted with the same methodology across genomes of comparable assembly quality, we studied the structural properties, phylogenetic conservation, copy status, and domain structure of protein-coding genes across the phylogeny of insects. We found that the conservation classes of protein-coding genes (present in all clades [core], present only in sister species [cloud], present in all remaining clades [shell]) differ in gene structure and protein domain diversity, and that differences in these characteristics even extend to the species-specific copy status of gene family members. The underlying mechanisms and driving forces remain unclear, however. We also characterized transposable elements (TEs) across insects (and other groups of arthropods).
TEs are a major component of metazoan genomes and are associated with a variety of mechanisms that shape genome architecture and evolution. However, despite the ever-growing number of sequenced insect genomes, our understanding of the diversity and evolution of insect TEs remains poor. Our analyses revealed that the insect TE repertoire contains TEs of almost every class previously described, and in some cases even TEs previously reported only from vertebrates and plants. Additionally, we identified a large fraction of unclassifiable TEs. We detected high variation in TE content, ranging from less than 6 % to more than 58 %, and a possible relationship between the content and diversity of TEs and genome size. While most insect orders exhibit a characteristic TE composition, we also observed intraordinal differences. Overall, our findings shed light on common patterns and reveal lineage-specific differences in the content and evolution of TEs in insects.

