Project Details
Bayesian Inference of Gene and Genome Duplication Underpinning Plant Terrestrialization
Applicant
Hengchi Chen, Ph.D.
Subject Area
Plant Genetics and Genomics
Evolution and Systematics of Plants and Fungi
Term
since 2025
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 566819120
Plant terrestrialization represents one of the most pivotal evolutionary events, ushering in the origin of embryophytes, which account for most of the biomass on land. Once on land, the earliest land plants radiated and gave rise to a wide range of novel forms and functions. Gene and genome duplication (GGD) is a major evolutionary force driving adaptation and innovation in organisms. However, little is known about how GGD may have shaped the process of plant terrestrialization. In this project, I aim to unravel the evolutionary role of GGD in shaping embryophyte genomes and propelling their conquest of land. To achieve this goal, I will pursue two synergistic objectives.
Objective 1: Development of a new suite of tools for Bayesian inference of gene and genome duplication. Previously, we demonstrated the feasibility of inferring gene and genome duplication with a Bayesian phylogenomic software package named Whale. Here, I will first thoroughly rewrite Whale from Julia to Python to enhance its extensibility and fully leverage current computational libraries. Next, I will introduce the Approximate Bayesian Computation algorithm to obtain posterior samples, in addition to the original No-U-Turn Sampler, which is effective but time-consuming. To account for uncertainty in divergence times, I will incorporate fossil calibrations and their associated prior age distributions into the posterior sampling process. Multiple categories of gene duplication and loss rates will be implemented to accommodate the varied selective constraints on gene families of different sizes, which are currently overlooked. Finally, I will develop a suite of tools for the fine-grained organization and visualization of the inferred duplication events for each individual gene family.
Objective 2: Delineation of the evolutionary significance of gene and genome duplication during plant terrestrialization.
I will first reconstruct a large-scale phylogeny of embryophytes, leveraging our established methods and datasets. Gene and genome duplication events will then be inferred across four categories of plant species, namely aquatic embryophytes, terrestrial embryophytes, aquatic algae, and subaerial algae. To illuminate the evolutionary impact of polyploidy, I will identify the preferentially retained gene duplicates derived from whole-genome duplication (WGD) and annotate their functions. Gene duplication in subaerial algae relative to aquatic algae, and in terrestrial embryophytes relative to aquatic embryophytes, will be delineated and functionally characterized to pinpoint which functions, metabolites, or signaling pathways may have needed to be strengthened or co-opted for adaptation to terrestrial habitats. The mode of gene duplication in each category of species will also be delineated, so that I can test whether it differs significantly among aquatic embryophyte, terrestrial embryophyte, aquatic alga, and subaerial alga genomes.
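As a minimal illustration of the Approximate Bayesian Computation idea named under Objective 1, the following Python sketch runs plain rejection ABC on a toy problem. It is not the Whale model and not the planned implementation. The Poisson simulator for per-family duplication counts, the uniform prior, the summary statistic (mean count), the tolerance, and all function names are assumptions chosen only to show how posterior samples can be obtained without evaluating a likelihood.

```python
import math
import random


def simulate_duplication_counts(rate, n_families, rng):
    """Toy simulator (assumption): one Poisson(rate) duplication count per family.

    Uses Knuth's multiplication method so that only the standard library is needed.
    """
    threshold = math.exp(-rate)
    counts = []
    for _ in range(n_families):
        k, p = 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        counts.append(k)
    return counts


def abc_rejection(observed, prior_max, tolerance, n_draws, seed=0):
    """Rejection ABC: keep prior draws whose simulated summary is close to the data.

    Prior (assumption): rate ~ Uniform(0, prior_max).
    Summary statistic (assumption): mean duplication count per family.
    """
    rng = random.Random(seed)
    observed_mean = sum(observed) / len(observed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, prior_max)
        simulated = simulate_duplication_counts(rate, len(observed), rng)
        simulated_mean = sum(simulated) / len(simulated)
        if abs(simulated_mean - observed_mean) <= tolerance:
            accepted.append(rate)
    return accepted


# Synthetic "observed" data generated with a known rate, for demonstration only.
true_rate = 1.5
observed = simulate_duplication_counts(true_rate, 200, random.Random(42))

posterior = abc_rejection(observed, prior_max=5.0, tolerance=0.1, n_draws=5000)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} of 5000 draws; posterior mean rate = {posterior_mean:.2f}")
```

The accepted draws approximate the posterior of the rate given the summary statistic. A smaller tolerance tightens the approximation at the cost of fewer accepted samples, which is the trade-off that would need tuning in any realistic application.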
DFG Programme
WBP Position
