Parameterized Algorithmics for Bioinformatics
Final Report Abstract
We have developed efficient & exact algorithms for numerous computationally hard problems in bioinformatics. We have given particular focus to developing algorithms which perform fast in practice. Noteworthy examples include the Cluster Editing problem, where our algorithms can solve instances exactly with tens of thousands of vertices and edge modifications; and the Maximum Colorful Subtree problem, where our code solves millions of instances every day for the interpretation of metabolomics data. All problems have direct applications in bioinformatics and biology, be it clustering, phylogenetics, biochemistry, metabolomics, or the analysis of genome rearrangements. On the theoretical side, we have developed numerous parameterized algorithms with improved worst-case running times for NP-hard problems; for other problems, we have proven that parameterized algorithms are very unlikely to exist. In bioinformatics research, the objective function of a problem is usually only a “crutch” used to find the optimum structure, whereas the value of the objective function has little or no meaning. Even elaborate heuristics for an NP-hard problem, which are capable of finding solutions with objective function value very close to the optimum, can result in solutions which are structurally very dissimilar from the optimum structure. We were able to show that this is not only a theoretical possibility, but happens regularly for real-world instances. This underlines the importance of finding exact solutions for bioinformatics problems; the structure of heuristic solutions, including local search heuristics such as Markov chain Monte Carlo, may deviate significantly from the optimal solution.
Publications
-
Exact Algorithms for Cluster Editing: Evaluation and Experiments. Algorithmica, 60(2):316-334, 2011
Böcker, Sebastian; Briesemeister, Sebastian & Klau, Gunnar W.
-
Kernel and Fast Algorithm for Dense Triplet Inconsistency. In Proc. of Theory and Applications of Models of Computation (TAMC 2010), volume 6108 of Lect Notes Comput Sci, pages 247-257. Springer, Berlin, 2010
Guillemot, Sylvain & Mnich, Matthias
-
A Closer Look at the Closest String and Closest Substring Problem. In Proc. of Algorithm Engineering & Experiments (ALENEX 2011), pages 13-24. SIAM, 2011
Chimani, Markus; Woste, Matthias & Böcker, Sebastian
-
Automated Bond Order Assignment as an Optimization Problem. Bioinformatics, 27(5):619-625, 2011
Dehof, Anna Katharina; Rurainski, Alexander; Bui, Quang Bao Anh; Böcker, Sebastian; Lenhof, Hans-Peter & Hildebrandt, Andreas
-
Comprehensive cluster analysis with Transitivity Clustering. Nat Protocols, 6:285-295, 2011
Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian & Baumbach, Jan
-
Finding and counting vertex-colored subtrees. Algorithmica, 65(4):828-844, 2013
Guillemot, Sylvain & Sikora, Florian
-
On the Parameterized Complexity of the Repetition Free Longest Common Subsequence Problem. Inform Process Lett, 112(7):272-276, 2012
Blin, Guillaume; Bonizzoni, Paola; Dondi, Riccardo & Sikora, Florian
-
Finding Maximum Colorful Subtrees in practice. J Comput Biol, 20(4):1-11, 2013
Rauf, Imran; Rasche, Florian; Nicolas, François & Böcker, Sebastian
-
The generalized Robinson-Foulds metric. In Proc. of Workshop on Algorithms in Bioinformatics (WABI 2013), volume 8126 of Lect Notes Comput Sci, pages 156-169. Springer, Berlin, 2013
Böcker, Sebastian; Canzar, Stefan & Klau, Gunnar W.
-
Speedy Colorful Subtrees. In Proc. of Computing and Combinatorics Conference (COCOON 2015), volume 9198 of Lect Notes Comput Sci, pages 310-322. Springer, Berlin, 2015
White, W. Timothy J.; Beyer, Stephan; Dührkop, Kai; Chimani, Markus & Böcker, Sebastian
-
Exact and heuristic algorithms for Cograph Editing. Technical report
W. Timothy J. White, Marcus Ludwig and Sebastian Böcker
-
Heuristic algorithms for the Maximum Colorful Subtree problem. Technical report
Kai Dührkop, Marie A. Lataretu, W. Timothy J. White and Sebastian Böcker
