Quadratically Constrained Quadratic Programming
Consider the quadratically constrained quadratic program: \[ \label{qcqp} \tag{QCQP} \begin{aligned} \inf \quad & x^T Q x \\ \text{s.t.} \quad & x^T A_i x \ge b_i, \forall i=1,\dots,m \end{aligned} \] where \(Q, A_1, \dots, A_m \in \mathcal{S}^n\). So far, no convexity is assumed. Nevertheless, we can apply the semi-definite relaxation (SDR) technique to transform \(\eqref{qcqp}\).
Observe that \(x^T Q x = \tr(x^T Q x) = \tr(Q x x^T) = Q \bullet x x^T\), which is linear in \(x x^T\). Applying the same trick to the constraints, \(\eqref{qcqp}\) is equivalent to \[ \begin{aligned} v^* = \inf \quad & Q \bullet X \\ \text{s.t.} \quad & A_i \bullet X \ge b_i, \forall i=1,\dots,m \\ & \exists x \in \R^n, X = x x^T \end{aligned} \] where the last constraint is non-convex. Note that \(\exists x \in \R^n, X = x x^T \iff X \in \mathcal{S}_+^n, \rank(X) \le 1\). That is, \[ \begin{aligned} \inf \quad & Q \bullet X \\ \text{s.t.} \quad & A_i \bullet X \ge b_i, \forall i=1,\dots,m \\ & X \succeq 0 \\ & \rank(X) \le 1 \end{aligned} \] The \(\rank\) constraint is not convex. We may as well drop it, so that the semi-definite relaxation of \(\eqref{qcqp}\) is \[ \label{relaxed-qcqp} \tag{Relaxed QP} \begin{aligned} v_R^* = \inf \quad & Q \bullet X \\ \text{s.t.} \quad & A_i \bullet X \ge b_i, \forall i=1,\dots,m \\ & X \succeq 0 \end{aligned} \] Observe that \(v^* \ge v_R^*\), since dropping a constraint of a minimization problem enlarges its feasible set, and \(\eqref{relaxed-qcqp}\) is an SDP.
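The lifting identity \(x^T Q x = Q \bullet x x^T\) can be checked numerically. A minimal pure-Python sketch (the helper names and random data are purely illustrative):

```python
import random

def quad_form(Q, x):
    # x^T Q x computed directly from the definition
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))

def frobenius_inner(Q, X):
    # Q . X = sum_{ij} Q_ij X_ij, which equals tr(Q X) for symmetric Q
    n = len(X)
    return sum(Q[i][j] * X[i][j] for i in range(n) for j in range(n))

n = 3
x = [random.gauss(0, 1) for _ in range(n)]
A = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]
Q = [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]  # symmetrize
X = [[x[i] * x[j] for j in range(n)] for i in range(n)]  # lift: X = x x^T

# The quadratic form in x equals a linear function of the lifted variable X.
assert abs(quad_form(Q, x) - frobenius_inner(Q, X)) < 1e-9
```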
Max-cut Problem
Let \(G = (V,E)\) be an undirected graph with \(n = |V|\) vertices and let \(w: E \to \R_+\) be a weight function on \(E\). A subset \(S \subseteq V\) defines a cut, whose value is \[ w(S) \triangleq \sum_{(i,j) \in E, i \in S, j \notin S} w_{ij} \] The goal is to find \(S \subseteq V\) that maximizes \(w(S)\). (The minimization problem is trivial: choosing \(S = V\) or \(S = \emptyset\) gives the minimum value \(0\).) Let \(x_i \in \{ -1, +1 \}\) be a binary variable indicating whether vertex \(i\) is in \(S\) (\(+1\)) or not (\(-1\)). Since \(\frac{1}{2}(1 - x_i x_j)\) equals \(1\) if \(x_i \ne x_j\) and \(0\) otherwise, \[ \label{mc} \tag{Max-cut} \begin{aligned} v^* = \max \quad & \sum_{(i,j) \in E} \frac{1}{2} w_{ij} (1 - x_i x_j) \\ \text{s.t.} \quad & x_i^2 = 1, \forall i=1,\dots,n \end{aligned} \] Apply the SDR technique (with an intermediate step converting the above to a QP) to get \[ \label{relaxed-mc} \tag{Relaxed Max-cut} \begin{aligned} v_\text{sdr}^* = \max \quad & \sum_{(i,j) \in E} \frac{1}{2} w_{ij} (1 - X_{ij}) \\ \text{s.t.} \quad & X_{ii} = 1, \forall i=1,\dots,n \\ & X \succeq 0 \end{aligned} \] Note that \(v^* \le v_\text{sdr}^*\) since the relaxation is of a maximization problem. If \(\eqref{relaxed-mc}\) is solved with a rank-one matrix \(X^*\), we can decompose it directly to obtain an optimal solution to \(\eqref{mc}\). The crux is how to preserve near-optimality when \(X^*\) has rank greater than one. Here is an algorithm for it:
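The equivalence between the cut value \(w(S)\) and the \(\pm 1\) objective can be checked on a toy graph; the graph and weights below are an arbitrary illustration:

```python
# Edges of a small weighted graph on vertices 0..3 (hypothetical example data).
edges = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.5, (0, 3): 3.0, (0, 2): 0.5}

def cut_value(S):
    # w(S): total weight of edges with exactly one endpoint in S
    return sum(w for (i, j), w in edges.items() if (i in S) != (j in S))

def objective(x):
    # Max-cut objective: sum of (1/2) w_ij (1 - x_i x_j) with x_i in {-1, +1};
    # each term is w_ij when x_i != x_j and 0 otherwise.
    return sum(0.5 * w * (1 - x[i] * x[j]) for (i, j), w in edges.items())

S = {0, 2}
x = {v: (1 if v in S else -1) for v in range(4)}
assert cut_value(S) == objective(x)  # both count exactly the crossing edges
```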
- Solve \(\eqref{relaxed-mc}\) to get an optimal solution \(X^*\). Factor \(X^* = U^T U\) where \(U \in \R^{n \times n}\) (e.g., via Cholesky or eigenvalue decomposition). Let \(u_i \in \R^n\) be the \(i\)-th column of \(U\). We have \(\|u_i\|_2^2 = u_i^T u_i = X_{ii}^* = 1\).
- Let \(r \in \R^n\) be a random vector uniformly distributed on the unit sphere \(S^{n-1} = \{ x \in \R^n: \|x\|_2 = 1 \}\) (obtained by normalizing a sample from the standard Gaussian in \(\R^n\)).
- Let \(x_i' = \sign(u_i^T r)\), where \(\sign(z)\) is \(+1\) if \(z \ge 0\) and \(-1\) otherwise. Return \(\{ x_i': i=1,\dots,n \}\) as a feasible solution to \(\eqref{mc}\). This step is also referred to as hyperplane rounding.
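The rounding steps above can be sketched in pure Python. This is a minimal sketch: the helper names are hypothetical, and a real implementation would obtain the columns \(u_i\) by factoring \(X^*\) with a linear-algebra library.

```python
import math
import random

def random_unit_vector(n):
    # Sample uniformly on S^{n-1} by normalizing a standard Gaussian vector.
    g = [random.gauss(0, 1) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def hyperplane_rounding(U):
    # U: list of unit vectors u_i (columns of a factorization X* = U^T U).
    # Returns x_i' = sign(u_i^T r), with sign(0) taken as +1.
    n = len(U[0])
    r = random_unit_vector(n)
    dots = [sum(u_k * r_k for u_k, r_k in zip(u, r)) for u in U]
    return [1 if d >= 0 else -1 for d in dots]

# Toy example: two antipodal unit vectors are (almost surely) separated
# by the random hyperplane, so they land on opposite sides of the cut.
U = [[1.0, 0.0], [-1.0, 0.0]]
x = hyperplane_rounding(U)
assert x[0] != x[1]
```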
First, the \(x_i'\) are feasible. Let \(v'\) be the objective value associated with the cut \(\{ x_i': i=1,\dots,n \}\). Clearly, \(v' \le v^*\). To analyze the approximation bound, note that \(\{ x_i': i=1,\dots,n \}\) is random, so we consider \(\E[v']\): \[ \begin{aligned} \E[v'] &= \frac{1}{2} \E\left[\sum_{(i,j) \in E} w_{ij} (1 - x_i' x_j')\right] \\ &= \sum_{(i,j) \in E} w_{ij} \E\left[\frac{1 - x_i' x_j'}{2}\right] \\ &= \sum_{(i,j) \in E} w_{ij} \Pr[\sign(u_i^T r) \ne \sign(u_j^T r)] \end{aligned} \] Let \(u, v \in S^{n-1}\) be arbitrary and \(r\) be uniformly distributed on \(S^{n-1}\). Then, \[ \Pr[\sign(u^T r) \ne \sign(v^T r)] = \frac{\arccos(u^T v)}{\pi} \] since the signs differ exactly when the hyperplane \(\{x: r^T x = 0\}\) separates \(u\) and \(v\), which happens with probability equal to the angle between them divided by \(\pi\). Under the above setting, for any \(z \in [-1, 1]\) and \(\theta \in [0, \pi]\) such that \(\cos \theta = z\), \[ \frac{\arccos z}{\pi} = \frac{2 \theta}{\pi(1 - \cos \theta)} \cdot \frac{1}{2} (1 - z) \ge \alpha \cdot \frac{1}{2} (1-z) \] where \(\alpha = \min_{0 < \theta \le \pi} \frac{2 \theta}{\pi (1 - \cos \theta)} > 0.878\). As a result, \[ \begin{aligned} \E[v'] &= \sum_{(i,j) \in E} w_{ij} \frac{\arccos(u_i^T u_j)}{\pi} \\ &\ge \sum_{(i,j) \in E} w_{ij} \alpha \cdot \frac{1}{2} (1 - \underbrace{u_i^T u_j}_{X_{ij}^*}) \\ &= \alpha \sum_{(i,j) \in E} \frac{1}{2} w_{ij} (1 - X_{ij}^*) \\ &= \alpha v_\text{sdr}^* \ge \alpha v^* \end{aligned} \]
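The constant \(\alpha\) and the pointwise bound \(\arccos(z)/\pi \ge \alpha \cdot \frac{1}{2}(1-z)\) can be verified numerically. A grid-search sketch (the grid sizes are arbitrary; the true minimizer is near \(\theta \approx 2.33\), giving \(\alpha \approx 0.8786\)):

```python
import math

def ratio(theta):
    # The quantity minimized in the definition of alpha.
    return 2 * theta / (math.pi * (1 - math.cos(theta)))

# Approximate alpha = min over 0 < theta <= pi of ratio(theta) on a fine grid.
alpha = min(ratio(k * math.pi / 100000) for k in range(1, 100001))
assert alpha > 0.878

# Check the pointwise bound arccos(z)/pi >= alpha * (1 - z)/2 on [-1, 1].
for k in range(201):
    z = -1 + k / 100
    assert math.acos(z) / math.pi >= alpha * (1 - z) / 2 - 1e-9
```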