Model-based Configuration of Algorithms for Solving Hard Computational Problems
Final Report Abstract
Parameterized algorithms are abundant in computer science and its applications. For many computational problems, there exists a wide array of solution approaches, and computer scientists need to identify the one that best matches domain-specific requirements. Typically, once a general solution approach has been chosen, a number of lower-level choices remain to be made before arriving at a complete algorithm specification. Often, some of those choices are left open; these free parameters allow users to adapt the algorithm to their particular scenario. As an example of a highly parameterized algorithm, consider IBM ILOG CPLEX, a commercial optimization tool for solving challenging mixed integer programming (MIP) problems, which occur prominently both in academia and in a wide range of important industrial applications, such as production planning and optimization, scheduling, logistics, vehicle routing, and the sustainable management of resources. CPLEX has about 80 user-configurable parameters (giving rise to roughly 10^38 different parameter combinations) that affect the solver's search mechanism. Other examples of parameterized algorithms can be found in areas as diverse as sorting, linear algebra, numerical optimization, compiler optimization, parallel computing, computer vision, machine learning, database query optimization, database server optimization, protein folding, and formal verification, and even in areas outside of computer science, such as water resource management.

Given the ubiquity of parameterized algorithms, designers and end users routinely face the problem of finding parameter instantiations that optimize empirical performance on a given set of problem instances. This algorithm configuration problem was the central problem tackled by the project at hand. Our approach is based on machine learning models that predict algorithm performance for untested parameter settings and uses these predictions to guide the search for good settings.

This project substantially improved the state of the art for this problem and, among other contributions, delivered a tool called SMAC that has been used to optimize a wide range of algorithms. The paper introducing SMAC received the runner-up best paper award at the LION conference in 2011 and has already been cited over 500 times to date. Further technical advances of the project include generalizing the underlying machine learning models to handle arbitrary types of parameters, proposing the first general model-based algorithm configuration method, developing an adaptive capping method that substantially speeds up SMAC, developing a parallel version of SMAC, and improving the Hydra approach for combining algorithm configuration and algorithm selection, enabling the selection of optimal parameter settings on a per-instance basis.
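To make the approach concrete, the following is a minimal sketch of the sequential model-based optimization loop that SMAC builds on: fit a model of algorithm performance on the configurations evaluated so far, use the model to pick a promising untested configuration, evaluate it, and repeat. The objective run_target_algorithm, the random-forest model, and the lower-confidence-bound selection rule are illustrative stand-ins, not the project's code; SMAC itself uses, for example, an expected-improvement criterion and handles categorical parameters and multiple problem instances.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def run_target_algorithm(config):
        # Hypothetical stand-in for running the target algorithm with the
        # given parameter configuration and measuring its runtime.
        return (config[0] - 0.3) ** 2 + 0.1 * np.sin(10 * config[1]) + 1.0

    rng = np.random.default_rng(0)
    dim = 2  # two continuous parameters, scaled to [0, 1]

    # Initial design: evaluate a handful of random configurations.
    X = rng.random((5, dim))
    y = np.array([run_target_algorithm(x) for x in X])

    for _ in range(20):
        # 1. Fit a performance model on all configurations evaluated so far.
        model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

        # 2. Let the model guide the search: score many random candidates with
        #    an optimistic lower confidence bound (predicted mean minus the
        #    spread across trees), favoring configurations that are either
        #    predicted to be good or still uncertain.
        candidates = rng.random((1000, dim))
        per_tree = np.stack([t.predict(candidates) for t in model.estimators_])
        score = per_tree.mean(axis=0) - per_tree.std(axis=0)
        challenger = candidates[np.argmin(score)]

        # 3. Evaluate the most promising configuration and add the result.
        X = np.vstack([X, challenger])
        y = np.append(y, run_target_algorithm(challenger))

    print("best configuration:", X[np.argmin(y)], "with runtime", y.min())

SMAC additionally races each challenger against the incumbent configuration across problem instances and uses adaptive capping to terminate clearly inferior runs early; those mechanisms are omitted here for brevity.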
Publications
- Sequential Model-Based Optimization for General Algorithm Configuration. Proceedings of the Conference on Learning and Intelligent OptimizatioN (LION 5), 2011.
  F. Hutter, H. Hoos, and K. Leyton-Brown
- Bayesian Optimization With Censored Response Data. NIPS Workshop on Bayesian Optimization, Sequential Experimental Design, and Bandits (BayesOpt'12), 2012.
  F. Hutter, H. Hoos, and K. Leyton-Brown
- Parallel Algorithm Configuration. Proceedings of the Conference on Learning and Intelligent OptimizatioN (LION 6), 2012.
  F. Hutter, H. Hoos, and K. Leyton-Brown
- Identifying Key Algorithm Parameters and Instance Features using Forward Selection. Proceedings of the Conference on Learning and Intelligent OptimizatioN (LION 7), 2013.
  F. Hutter, H. Hoos, and K. Leyton-Brown