Stochastic Proximal Iteration: A Non-Asymptotic Improvement Upon Stochastic Gradient Descent

E. Ryu and S. Boyd

Manuscript, 2017.

In many stochastic optimization problems, the learner is provided with random functions, and the common practice has been to differentiate these functions and perform stochastic gradient descent (SGD). However, using only the gradient, rather than the entire function, seems suboptimal. To address this issue, we present an algorithm we call stochastic proximal iteration (SPI). Each iterate of SPI is obtained by applying the proximal mapping with respect to the given random function to the previous iterate. This makes SPI an online algorithm with a computational cost comparable to that of SGD in certain interesting settings. Using machinery from monotone operator theory, we show that SPI has asymptotic performance equivalent to that of SGD but performs better in the non-asymptotic regime.
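
As the abstract describes, the SPI update applies the proximal mapping of the sampled function to the previous iterate, x^{k+1} = argmin_x { f_k(x) + ||x - x^k||^2 / (2 alpha_k) }. The sketch below (not taken from the manuscript) compares SPI and SGD steps on a stochastic least-squares problem, one setting where the proximal step has a closed form and costs about the same as a gradient step; the problem data, step-size schedule, and function names are illustrative assumptions.

# Minimal sketch (illustrative, not from the manuscript): SPI vs. SGD on a
# random least-squares problem with f_i(x) = 0.5 * (a_i @ x - b_i)**2.
# For this f_i the proximal step has a closed form, so an SPI step costs
# roughly the same as an SGD step.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

def sgd_step(x, a, bi, alpha):
    # x - alpha * grad f_i(x)
    return x - alpha * (a @ x - bi) * a

def spi_step(x, a, bi, alpha):
    # prox_{alpha f_i}(x) = argmin_z f_i(z) + ||z - x||^2 / (2*alpha);
    # closed form when f_i(z) = 0.5 * (a @ z - bi)**2.
    return x - alpha * (a @ x - bi) / (1.0 + alpha * (a @ a)) * a

def run(step, iters=5000):
    x = np.zeros(d)
    for k in range(iters):
        i = rng.integers(n)
        alpha = 1.0 / (k + 1)  # diminishing step sizes (illustrative choice)
        x = step(x, A[i], b[i], alpha)
    return np.linalg.norm(x - x_true)

print("SGD error:", run(sgd_step))
print("SPI error:", run(spi_step))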