Detecting footprint of selection in rye (Secale cereale L.) - unraveling the past for future crop improvement
Final Report Abstract
The effects of artificial selection on the rye genome were examined by investigating 523 target sequences in five winter rye populations and 15 inbred lines. The material reflected the breeding history of the rye seed parent gene pool, from founder landraces to current elite material. Sequence capture probes (Roche/NimbleGen) were designed for cDNA target sequences selected from an established EST resource, based on genetic map positions matching loci for breeding-relevant traits such as frost tolerance, flowering time and self-incompatibility. In addition, sequences with homology to rice sequences putatively involved in grain quality were chosen. A hybrid sequencing approach combining Roche’s 454 and Illumina’s HiSeq 2000 technologies was then applied, yielding more than 40 Gbp of sequence data. These data were used to generate genomic reference sequences (gDNA) for the 523 target sequences. Sequencing data were mapped against the established gDNA reference sequences to detect SNPs between genotypes, whereas genotype-specific assemblies were generated to determine the coverage of each target sequence and to calculate nucleotide diversity. The 10,320 detected SNPs were used to estimate molecular diversity within and among populations. Interestingly, no reduction in overall nucleotide or molecular diversity among populations was observed with increasing crop improvement, suggesting that crop improvement had only a minor effect on genome-wide diversity. In contrast, the inbred lines showed, as expected, fewer polymorphic loci and lower molecular diversity than the populations. This is likely the result of using smaller numbers of more closely related breeding lines and of intentionally limited genetic exchange among heterotic groups in rye.
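The abstract does not state which estimator was used for nucleotide diversity. As a minimal sketch, assuming the common definition of nucleotide diversity (π) as the average number of pairwise differences per site among aligned sequences, the calculation could look like the following. The function name and the toy sequences are hypothetical and are not taken from the project data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for aligned sequences.

    Assumption: all sequences are aligned and of equal length, with no
    gaps or missing data. Real capture data with uneven coverage would
    need additional per-site handling.
    """
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    total_diff = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Count the sites at which this pair of sequences differs.
        total_diff += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1

    # Mean number of differences per pair, normalised by alignment length.
    return total_diff / (n_pairs * length)


# Hypothetical toy alignment: three sequences, four sites.
toy = ["ACGT", "ACGA", "TCGA"]
pi = nucleotide_diversity(toy)
```

Comparing such values between landraces, improved populations and inbred lines is one simple way to express the diversity contrasts described above.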
The rye breeding history examined in this study showed that crop improvement relied mostly on common alleles already present in the populations, whereas bottleneck events presumably occurred before the timeframe considered. Future rye breeding might therefore benefit greatly from the integration of genetically distant relatives into new breeding programs. With this large-scale analysis, a first-generation map of selection for rye was established, indicating genomic regions and target genes with selection signatures, mainly involved in stress response, self-incompatibility and frost tolerance. In the future, the accumulation of rye genome sequencing data, the establishment of a high-density SNP genotyping array and ongoing research on the rye genome will probably allow the development of an urgently needed genomic reference sequence of rye and will overcome the limitations of this project with regard to data analysis (e.g. uneven sequence coverage, missing genome annotation). Building on this progress, more advanced methods for detecting genome-wide selection signatures might follow the present project, which provides a first glimpse of artificial selection in the breeding history of the rye seed parent pool.