In many contemporary optimization problems, such as parameter training for deep learning architectures, it is computationally challenging or even infeasible to evaluate an entire function or its derivatives. This necessitates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees classically obtained through globalization techniques via a trust region or a line search. Using subsampled function values is
particularly challenging for the latter strategy, that relies upon multiple evaluations. On top of that all, there has been an increasing interest for nonconvex formulations of data-related problems. For such instances, one aims at developing methods that converge at a reasonable rate to second-order stationary points, which is particularly delicate to ensure when one only accesses subsampled approximations of the objective and its derivatives.
In this talk I present a stochastic algorithm based on negative curvature and Newton-type directions, computed for a subsampling model of the objective. A line-search technique is used to enforce suitable decrease for this model,
and for a sufficiently large sample, a similar amount of reduction holds for the true objective. By using probabilistic reasoning, we can then obtain worst-case complexity guarantees for our framework. Furthermore I demonstrate the practical performance of the algorithm on some nonconvex optimization problems arising from training CNNs.