Extent, speed, and causes of changes in the protein-coding gene repertoire of holometabolous insects (Insecta: Endopterygota)
Evolution, Anthropology
Ecology and Biodiversity of Animals and Ecosystems, Organismic Interactions
Final Report Abstract
We studied the evolution of protein-coding genes and transposable elements across insects, exploiting data from published and unpublished genomes, including genomes that we sequenced, assembled, and annotated ourselves. We first assessed to what extent automatically inferred gene models, which typically form the basis of comparative genomic analyses, are as reliable as manually curated gene models when studying gene structural parameters. To this end, we compared the structural properties of protein-coding genes in automatically annotated and manually curated gene sets from seven insect species sequenced by the i5k initiative, using the software COGNATE, which we developed for this purpose and made publicly available. We show that the properties of automatically generated gene models and their manually curated counterparts do not differ extensively, and that major trends in gene structure can be recovered congruently from both sets. Thus, large-scale comparisons of gene structure based on automatically generated annotations of protein-coding genes appear justifiable. Analyzing gene models predicted with the same methodology across genomes of comparable assembly quality, we studied the structural properties, phylogenetic conservation, copy status, and domain structure of protein-coding genes across the phylogeny of insects. We found that the conservation classes of protein-coding genes (present in all clades [core], present only in sister species [cloud], present in all remaining clades [shell]) differ in gene structure and protein domain diversity, and that these differences even extend to the species-specific copy status of gene family members. The underlying mechanisms and driving forces remain unclear, however. We also characterized transposable elements (TEs) across insects (and other groups of arthropods).
TEs are a major component of metazoan genomes and are associated with a variety of mechanisms that shape genome architecture and evolution. However, despite the ever-growing number of sequenced insect genomes, our understanding of the diversity and evolution of insect TEs remains poor. Our analyses revealed that the insect TE repertoire contains TEs of almost every class previously described, and in some cases even TEs previously reported only from vertebrates and plants. Additionally, we identified a large fraction of unclassifiable TEs. We detected high variation in TE content, ranging from less than 6 % to more than 58 % of the genome, and a possible relationship between TE content and diversity on the one hand and genome size on the other. While most insect orders exhibit a characteristic TE composition, we also observed intraordinal differences. Overall, our findings shed light on common patterns and reveal lineage-specific differences in the content and evolution of TEs in insects.
Publications
-
(2017): Characterizing gene repertoires or discover your music. — N2 Science Communication Conference at the Museum für Naturkunde Berlin, Berlin (November 6–8)
Wilbrandt J, Misof B, Niehuis O
-
(2017): Data basis, tool choice, human review: influences on predicted protein-coding gene structure. — 110th Annual Conference of the German Zoological Society, University of Bielefeld, Bielefeld (September 12–15)
Wilbrandt J, Misof B, Panfilio K, Niehuis O
-
(2017): COGNATE: comparative gene annotation characterizer. BMC Genomics 18: 535
Wilbrandt J, Misof B, Niehuis O
-
Diversity and Evolution of the Transposable Element Repertoire in Insects. — Joint Congress on Evolutionary Biology, Montpellier (August 2018)
Petersen M, Armisén D, Gibbs RA, Hering L, Khila A, Mayer G, Richards S, Niehuis O, Misof B
-
(2019): Diversity and evolution of the transposable element repertoire in arthropods with particular reference to insects. BMC Evolutionary Biology 19: 1
Petersen M, Armisén D, Gibbs RA, Hering L, Khila A, Mayer G, Richards S, Niehuis O, Misof B