Project Details
Dual Training of Nonlinear Support Vector Machines on a Budget
Applicant
Professor Dr. Tobias Glasmachers
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2016 to 2020
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 287461288
Machine learning algorithms create predictive models from data. The discipline connects statistics, computer science, and optimization. Support Vector Machines (SVMs) have been established as a standard method. SVM classifiers are applied in many areas of science and technology, e.g., in bioinformatics, robotics, (medical) image processing, and many more.SVM Training on a BudgetTraining an SVM model requires solving an optimization problem. The established standard solver is a dual decomposition algorithm. Despite its relatively high efficiency it fails to keep up with the rapid growth of available data sets ("big data" challenge). This is due to the form of the SVM model, which is a linear expansion of basis functions, the number of which grows with the size of the data.This growth of the number of basis functions with the data set size has been identified as a key obstacle for efficient SVM training on large-scale data. Budget methods are a relatively recent approximation approach. They define an a priori upper bound on the allowable number of basis functions (the budget). This results in faster iterations of the training algorithm, yielding an overall reduction of the runtime.The choice of the budget size by the user defined an implicit trade-off between computation time per iteration and solution accuracy. Getting this value right is key to the success of learning. Hence the strong dependence of this value on the (type of) data is problematic. Furthermore, budget methods are currently available only for primal optimization algorithms, which converge at a significantly slower rate compared to the standard solver.Project GoalsA first goal of the project is the development of a dual decomposition solver with budget constraint. This approach promises to combine the fast convergence of the dual standard algorithm with fast iterations of the budget approach. The expected result is a significant reduction of computation time over the standard approach as well as over primal budget methods.A second goal is the development of methods for the automatic adjustment of the budget size. The need for such methods is independent of the optimization algorithm. The budget will be adjusted so that it is best suited either given a training time or a target accuracy. Both approaches allow for the specification of a meaningful and data independent goal of SVM training on a budget.The algorithms developed within this project will be underpinned with an analysis of convergence behavior and approximation error. An efficient implementation will be made available under a free software license.
DFG Programme
Research Grants