Design, analysis, development and experimental validation of algorithms for high throughput sequencing mass data using the SeqAn library for biological sequence analysis

Applicant Professor Dr. Knut Reinert

Subject Area Bioinformatics and Theoretical Biology

Term from 2010 to 2015

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 192954395

During the last five years, modern sequencing technologies have brought a super-exponential growth of sequencing capacities. At the time of writing this proposal, it is possible to sequence about 30 billion nucleotides per day on a single sequencing machine. This proposal aims to respond to this increase in genomic sequence data with algorithmic approaches that benefit from redundancies across multiple datasets. More specifically, we aim at: 1) Developing a data structure that represents one or more genomic sequences by storing only the differences to a similar reference sequence, while maintaining the ability to navigate quickly in all sequences. We then use this data structure to develop algorithms that transform the substring index of a reference into the substring index of a new genome without rebuilding it from scratch, storing only the differences to the reference index. 2) Developing algorithms that efficiently process multiple genomes in parallel, based on the representation developed in 1). 3) Bridging the gap between algorithm theory and practical implementations by extending SeqAn, both as a library providing the core algorithmic components required to analyze large-scale genomic data and as an experimental platform to design, analyze, and implement state-of-the-art bioinformatics algorithms.
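The idea behind aim 1), storing a genome as differences to a reference while still allowing fast navigation, can be illustrated with a minimal sketch. It is not the project's data structure and not SeqAn code: the names Delta, DeltaSequence, and materialize are hypothetical, and the sketch assumes non-overlapping edits given in reference coordinates. Random access costs one binary search over the stored edits, and the derived sequence is never copied in full.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    """One edit against the reference (hypothetical illustration type)."""
    ref_pos: int  # reference position where the edit starts
    ref_len: int  # number of reference characters replaced (0 = pure insertion)
    alt: str      # replacement text ("" = pure deletion)


class DeltaSequence:
    """A sequence stored as a shared reference plus a sorted list of edits."""

    def __init__(self, reference, deltas):
        self.reference = reference
        self.deltas = sorted(deltas, key=lambda d: d.ref_pos)
        self.starts = []  # start of each edit in derived-sequence coordinates
        shift = 0         # accumulated length change caused by earlier edits
        prev_end = 0      # end (in the reference) of the previous edit
        for d in self.deltas:
            if d.ref_pos < prev_end:
                raise ValueError("edits must not overlap")
            self.starts.append(d.ref_pos + shift)
            shift += len(d.alt) - d.ref_len
            prev_end = d.ref_pos + d.ref_len
        if prev_end > len(reference):
            raise ValueError("edit extends past the end of the reference")
        self.length = len(reference) + shift

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        # Find the last edit that starts at or before position i.
        k = bisect_right(self.starts, i) - 1
        if k < 0:
            return self.reference[i]  # before the first edit: unchanged
        d = self.deltas[k]
        offset = i - self.starts[k]
        if offset < len(d.alt):
            return d.alt[offset]      # inside the replacement text
        # Past the edit: map the position back into the reference.
        return self.reference[d.ref_pos + d.ref_len + offset - len(d.alt)]


def materialize(seq):
    """Spell out the full derived sequence (used here only for checking)."""
    return "".join(seq[i] for i in range(len(seq)))


reference = "ACGTACGT"
sample = DeltaSequence(reference, [
    Delta(2, 1, "T"),   # substitution G -> T
    Delta(4, 0, "GG"),  # insertion of GG before reference position 4
    Delta(6, 1, ""),    # deletion of the G at reference position 6
])
print(materialize(sample))  # ACTTGGACT
```

Several DeltaSequence objects can share one reference string, so memory grows with the number of differences rather than with the number of genomes. Updating a substring index from such differences, the second part of aim 1), is a separate problem that this sketch does not address.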

DFG Programme Research Grants
