Min-Max Approximate Dynamic Programming

B. O'Donoghue, Y. Wang, and S. Boyd

Proceedings IEEE Multi-Conference on Systems and Control, pages 424–431, September 2011.


In this paper we describe an approximate dynamic programming policy for a discrete-time dynamical system perturbed by noise. The approximate value function is the pointwise supremum of a family of lower bounds on the value function of the stochastic control problem; evaluating the control policy involves solving a min-max, or saddle-point, problem. For a quadratically constrained linear quadratic control problem, evaluating the policy amounts to solving a semidefinite program at each time step. Evaluating the policy also yields a lower bound on the value function, which can be used to assess performance: when the lower bound and the achieved performance of the policy are close, we can conclude that the policy is nearly optimal. We describe several numerical examples where this is indeed the case.
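
To make the structure of such a policy concrete, here is a minimal Python sketch on a toy scalar problem. It is only an illustration under stated assumptions, not the method from the paper: the system coefficients `a`, `b`, `sigma2`, the stage cost `x**2 + u**2`, the input bound `|u| <= 1`, and the quadratic family `p`, `s` are all hypothetical, the family members are assumed (not verified) to be lower bounds on the value function, and a grid search stands in for the semidefinite program. The supremum is taken over the expected value of each family member, which is one plausible reading of the min-max form.

```python
import numpy as np

# Hypothetical scalar system x+ = a*x + b*u + w, with E[w] = 0 and E[w^2] = sigma2.
a, b, sigma2 = 1.0, 0.5, 0.1

# Hypothetical family of quadratics V_i(z) = p_i*z^2 + s_i.
# Assumption: each V_i is a lower bound on the true value function (not checked here).
p = np.array([0.5, 1.0, 2.0])
s = np.array([0.0, -0.3, -1.5])


def v_hat(z):
    """Approximate value function: pointwise maximum of the family at state z."""
    return np.max(p * z**2 + s)


def minmax_policy(x, u_grid=np.linspace(-1.0, 1.0, 401)):
    """Grid-search stand-in for the min-max problem: min over u, max over i."""
    z = a * x + b * u_grid            # mean next state for each candidate input
    stage = x**2 + u_grid**2          # hypothetical stage cost l(x, u)
    # For zero-mean w: E[V_i(z + w)] = p_i*(z^2 + sigma2) + s_i
    exp_v = p[:, None] * (z[None, :] ** 2 + sigma2) + s[:, None]
    objective = stage + exp_v.max(axis=0)   # inner maximization over the family
    k = np.argmin(objective)                # outer minimization over the input
    return u_grid[k], objective[k]


u, val = minmax_policy(1.0)
print(f"input at x = 1: {u:.3f}, min-max objective: {val:.3f}")
print(f"lower-bound estimate v_hat(1) = {v_hat(1.0):.3f}")
```

If the family members really are lower bounds, `v_hat(x)` is one as well, so it can be compared against the simulated cost of the policy to judge how far from optimal the policy might be.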