Pan-Genome Structures: Design, Construction, and Applications
Final Report Abstract
Thanks to enormous progress in sequencing technologies, the genomes of many individuals are now available for some species. Although the DNA sequences of two individuals of the same species are in general very similar, they differ at many sites. These similarities and differences can be represented, for example, by a pangenome graph, where the term ‘pangenome’ refers to the whole genomic content of a species. The goal of this project was to develop a data structure that efficiently supports as many as possible of the functionalities demanded by the Computational Pan-Genomics Consortium. We made new contributions to the following topics: construction and efficient storage of such a pangenome structure, coordinate systems for pangenome graphs, visualization of subgraphs, and pangenomic read mapping.
Publications
- Büchler, Thomas & Ohlebusch, Enno. An improved encoding of genetic variation in a Burrows–Wheeler transform. Bioinformatics, 36(5), 1413-1419.
- Dede, Kadir & Ohlebusch, Enno. Dynamic construction of pan-genome subgraphs. Open Computer Science, 10(1), 82-96.
- Büchler, Thomas; Räther, Caroline; Weber, Pascal & Ohlebusch, Enno. Coordinate Systems for Pangenome Graphs based on the Level Function and Minimum Path Covers. Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies, 21-29. SCITEPRESS - Science and Technology Publications.
- Baier, Uwe; Büchler, Thomas; Ohlebusch, Enno & Weber, Pascal. Edge minimization in de Bruijn graphs. Information and Computation, 285, 104795.
- Büchler, Thomas; Olbrich, Jannik & Ohlebusch, Enno. Efficient short read mapping to a pangenome that is represented by a graph of ED strings. Bioinformatics, 39(5).
- Olbrich, Jannik; Büchler, Thomas & Ohlebusch, Enno. Generating multiple alignments on a pangenomic scale. Bioinformatics, 41(3).
