The Continuous Stochastic Gradient Method

Language
en
Document Type
Doctoral Thesis
Granting Institution
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Naturwissenschaftliche Fakultät
Issue Date
2024
Authors
Uihlein, Andrian
Abstract

We consider a class of optimization problems in which the objective function requires the computation of the expected value of a hard-to-evaluate quantity. Such structures arise in probabilistic settings, e.g., expected values in optimization under uncertainty, as well as in fully deterministic contexts, e.g., optimizing a design for a (possibly infinite) set of scenarios. Accordingly, both stochastic and deterministic optimization schemes have been considered for such problems. The latter typically rely on a fixed quadrature rule to approximate all appearing integrals numerically. As a result, the associated computational cost may render such approaches infeasible in complex settings. Moreover, discretization can introduce artificial local solutions if the quadrature rule is not chosen carefully. In contrast, stochastic sample-based methods do not suffer from this problem and provide a computationally cheap alternative. However, these schemes are typically limited to objective functions of very simple structure, i.e., minimizing a single expected value. Settings in which the objective consists of several integrals, possibly coupled by nonlinear functions, are beyond the scope of such techniques. In this work, we consider continuous stochastic optimization approaches, specifically, a continuous stochastic gradient (CSG) scheme. To keep the numerical effort per iteration low, these methods also rely on random gradient samples instead of full integrals. Unlike traditional stochastic sample-based strategies, however, the gradient information is stored over the course of the iterations. By calculating suitable design-dependent integration weights, a convex combination of old gradient samples is used to build an approximation to the full gradient. We show that the approximation error almost surely converges to zero during the optimization process.
As a result, we prove that CSG admits the same convergence results as a full gradient scheme, i.e., convergence for constant step sizes and for line search techniques based on the continuous stochastic approximations. Moreover, this means that CSG is not restricted to the minimization of simple expected values and can instead handle arbitrary combinations of integrals and Lipschitz continuous functions in the objective. In this sense, CSG can be considered a hybrid of stochastic gradient methods and a classical gradient descent scheme. Several methods to obtain the aforementioned integration weights in practice are presented and discussed. Furthermore, we provide a full theoretical convergence analysis of CSG and perform numerical experiments covering all important aspects of continuous stochastic optimization methods. For several large-scale applications from the field of topology optimization, the proposed schemes are compared to different optimization techniques known from the literature.
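To illustrate the core idea described above, a minimal sketch of one CSG-style iteration loop is given below. It is not the thesis's actual implementation: the toy objective J(u) = E[(u - x)^2] with x uniform on [0, 1] (minimizer u* = 1/2), the step size, and the Monte Carlo nearest-neighbor estimate of the integration weights are all illustrative assumptions; the thesis discusses several practical ways to compute these weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective: J(u) = E_x[(u - x)^2], x ~ U(0, 1); minimizer u* = 1/2.
# The gradient sample at a design u and integration sample x:
def grad_sample(u, x):
    return 2.0 * (u - x)

tau = 0.2                  # constant step size (CSG supports constant steps)
u = 0.0                    # initial design
designs, samples, grads = [], [], []

for n in range(300):
    x = rng.uniform()      # draw one new random integration sample
    designs.append(u)
    samples.append(x)
    grads.append(grad_sample(u, x))

    D = np.array(designs)
    X = np.array(samples)
    G = np.array(grads)

    # Illustrative design-dependent weights: for each stored pair (u_k, x_k),
    # estimate (by Monte Carlo) the measure of the region of sample space in
    # which (u_k, x_k) is the closest stored pair to the current pair (u, x').
    probe = rng.uniform(size=200)
    dist = (D - u)[:, None] ** 2 + (X[:, None] - probe[None, :]) ** 2
    nearest = np.argmin(dist, axis=0)
    alpha = np.bincount(nearest, minlength=len(G)) / len(probe)

    # Convex combination of stored gradient samples approximates the
    # full gradient; the approximation error vanishes as n grows.
    g_hat = alpha @ G
    u -= tau * g_hat
```

After a few hundred iterations the design u settles near the true minimizer 1/2, even though every iteration evaluates only a single new gradient sample: the stored samples, reweighted for the current design, supply the rest of the integral information.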
