Graph clustering and community detection via Semidefinite Programming

 

Motivation: Given an undirected graph G=(V=[n],E), we want to partition its vertices into disjoint subsets. This task arises in applications to clustering, and network analysis. A common strategy is the following:

  1. Construct a low-dimensional embedding of the vertices. Explicitly, for each vertex iin V, contruct a vector {mathbf x}_iin  {mathbf R}^k, with ambient dimension kll n.

  2. Use some greedy procedure to cluster the points {mathbf x}_1,dots,{mathbf x}_n.

The code presented here provides an efficent way to implement step 1 by solving a semidefinite programming relaxation.

On the left side, we show a small two-dimensional example. (A two-dimensional embedding of a network of political books from Mark Newman's repository. Colors correspond to political orientation. The embedding separates quite clearly two groups.)

Semidefinite Programming (SDP): Let A denote the adjacency matrix of G, and {mathbf 1} the ‘all ones’ vector. We then use the following SDP:

 begin{array}{ll} {rm maximize} & langle A-gamma {mathbf 1}{mathbf 1}^{sf T},Xrangle  {rm subj. to} & Xsucceq 0 & X_{ii} = 1; forall iin [n] end{array}

Here langle B,Crangle  = {rm Trace}(B^{sf T}C) denotes the usual scalar product between matrices, and we write B succeq 0 if B is positive semidefinite.

Solving this SDP by generic methods can be quite slow. We reduce the complexity by constraining X to have rank mll n, and changing variables. We introduce the ‘spin’ variables sigma_iin{mathbf R}^m and attempt to solve the following non-convex problem:

 begin{array}{ll} {rm maximize} & sum_{(i,j)in E} sigma_icdotsigma_j-frac{gamma}{2}big|Mbig|_2^2  {rm subj. to} & |sigma_{i}|_2 = 1; forall iin [n], end{array}

where Mequivsum_{i=1}^nsigma_i. Somewhat surprisingly, we found that m as small as 10 to 20 is sufficient to guarantee global convergence in most practical examples (even if n>10^5).

The embedding {mathbf x}_i is simply a low-dimensional projection of the spin sigma_i (with kle m).

Paper

A. Javanmard, A. Montanari, and F. Ricci-Tersenghi, Phase Transitions in Semidefinite Relaxations, 2015

People

Related Papers

S. Burer, R. D.C. Monteiro, A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization, Mathematical Programming 2003

A. Montanari and S.Sen, Semidefinite Programs on Sparse Random Graphs and their Application to Community Detection, 201