Efficient Projections onto the L1-Ball for Learning in High Dimensions

John Duchi, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra

International Conference on Machine Learning (ICML 2008)

Note: The first part of our paper (the linear-time algorithm) was solved in greater generality by the two papers below. Thanks to Ohad Shamir for pointing out the second.


In any case, below is our paper:

We describe efficient algorithms for projecting a vector onto the L1-ball. We present two methods for projection. The first performs exact projection in O(n) expected time, where n is the dimension of the space. The second works on vectors in which k elements are perturbed outside the L1-ball, projecting in O(k log n) time. This setting is especially useful for online learning in sparse feature spaces such as text categorization applications. We demonstrate the merits and effectiveness of our algorithms in numerous batch and online learning tasks. We show that variants of stochastic gradient projection methods augmented with our efficient projection procedures outperform interior-point methods, which are considered state-of-the-art optimization techniques. We also show that in online settings gradient updates with L1 projections outperform the exponentiated gradient algorithm while obtaining models with high degrees of sparsity.
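
For concreteness, here is a minimal sketch of the projection, assuming NumPy. It implements the simple O(n log n) sorting-based variant rather than the paper's randomized-pivot algorithm (which attains O(n) expected time), but both compute the same Euclidean projection onto the L1-ball of radius z.

    import numpy as np

    def project_l1_ball(v, z=1.0):
        """Project v onto the L1-ball of radius z.

        Sorting-based O(n log n) sketch; the paper's randomized-pivot
        method computes the same projection in O(n) expected time.
        """
        if z <= 0:
            raise ValueError("radius z must be positive")
        u = np.abs(v)
        if u.sum() <= z:  # already inside the ball: the projection is v itself
            return v.copy()
        # Sort magnitudes in decreasing order and find the threshold theta.
        mu = np.sort(u)[::-1]
        cumsum = np.cumsum(mu)
        rho = np.nonzero(mu - (cumsum - z) / np.arange(1, len(u) + 1) > 0)[0][-1]
        theta = (cumsum[rho] - z) / (rho + 1.0)
        # Soft-threshold: shrink each magnitude by theta, keep original signs.
        return np.sign(v) * np.maximum(u - theta, 0.0)

    # Example: project a 5-dimensional vector onto the unit L1-ball.
    v = np.array([0.5, -1.2, 0.3, 0.0, 2.0])
    w = project_l1_ball(v, z=1.0)
    print(w, np.abs(w).sum())  # -> [0, -0.1, 0, 0, 0.9], L1 norm exactly 1

The soft-thresholding step reflects the structure of the solution: every coordinate's magnitude is reduced by a common theta and the remainder are zeroed out, which is why the projection naturally produces sparse vectors.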