Project Details
A novel second-order derivative and solution methodology for nonsmooth optimization
Applicant
Dr. Bennet Gebken
Subject Area
Mathematics
Term
since 2024
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 545166481
In many applications, optimization problems arise with objective functions that are not smooth. There are well-established methods for the solution of these problems based on first-order derivative information, like the Clarke subdifferential, and linear models of the objective function. But since they only employ linear models, the order of convergence of these methods is at best linear. In smooth optimization, there are well-known solution methods based on second-order derivatives (e.g., Newton’s method), which are able to achieve a higher order of convergence. Thus, the question naturally arises whether such methods can be constructed for the non-smooth case as well. While methods have been proposed that are able to mimic the behavior of Newton’s method in the non-smooth case and that show hints of super-linear convergence in numerical examples, theoretical results on the order of convergence of these methods are almost non-existent. The goal of this project is to develop a novel second-order solution methodology for non-smooth optimization problems that combines the fast convergence of existing higher-order methods with a proper theoretical foundation in non-smooth analysis. As a second-order derivative, the second-order epsilon-jet is used, which contains the coefficients of all (existing) second-order Taylor expansions in an epsilon-ball around a given point. Based on this concept, a model is defined as the maximum of these Taylor expansions. Iteratively constructing and minimizing this model results in a simple, abstract descent method for non-smooth optimization. To obtain a practical method, the epsilon-jet is approximated via a deterministic adaptive sampling scheme. As such, the method can be interpreted as a second-order gradient sampling method. In a recent preprint, I have already presented the first ideas of this new method and presented numerical results that suggest that, in terms of function evaluations, it is often faster than other higher-order methods. However, the key theoretical and practical questions have not yet been addressed, which is the goal of this project. First of all, the theoretical analysis of the method has to be further developed. This includes the proof of convergence in the nonconvex case and a theoretical analysis of the order of convergence. Secondly, the efficiency of the method has to be improved. At this point, the method relies on having exact Hessian matrices, which are often difficult to obtain in practice. Furthermore, the efficiency is hindered by the fact that no specialized solver has been proposed for the minimization of the second-order model. Finally, the practical capabilities of the method have to be investigated through a more thorough comparison to other higher-order solution methods and the application to practical problem classes.
DFG Programme
WBP Position