5-conic-linear-programming

Though some problems can be relaxed to fit the linear programming form, applications of LP remain limited by its linear constraints. A natural idea for extending LP is to gradually admit certain kinds of non-linear constraints, but doing so naively would stray too far from the established theory of linear programming.

In this post, we study conic linear programming, a measured first step toward non-linear problems.

Concept Preparation

“Good” Order and “Proper” Cone

Before exploring other non-linear problems, we first study the properties of a “good” order \(\succeq\) between vectors. We expect it to have the following three basic properties of a partial order:

  • Reflexivity: \(\forall u \in \R^n, u \succeq u\);

  • Anti-symmetry: \[ \forall u,v \in \R^n, u \succeq v, v \succeq u \to u = v \]

  • Transitivity: \[ \forall u,v,w \in \R^n, u \succeq v, v \succeq w \to u \succeq w \]

Other than them, we expect \(\succeq\) to further possess the following two arithmetic properties:

  • Homogeneity: \[ \forall u,v \in \R^n, \forall \alpha \ge 0, u \succeq v \to \alpha u \succeq \alpha v \]

  • Additivity: \[ \forall u,v,w,z \in \R^n, u \succeq v, w \succeq z \to u + w \succeq v + z \]

We say the order \(\succeq\) is “good” if it satisfies the above five properties. These five properties take the algebraic perspective. To illustrate the geometric counterpart, consider the set \(K \triangleq \{ x \in \R^n: x \ge 0 \}\). We say \(K\) is a “proper” cone in that it is

  • non-empty and closed under addition \[ \forall u, v \in K, u + v \in K \]

  • conic \[ \forall u \in K, \alpha > 0 \to \alpha u \in K \]

  • pointed \[ u, -u \in K \to u = 0 \]

We further claim that a proper cone is convex and contains the zero vector (verify it).

Remark: Pointedness plus closedness (topological closedness, not closure under addition) implies that there is no complete line in the cone, which equivalently means no non-trivial subspace (i.e., other than \(\{ 0 \}\)) lies inside the cone. To show it, let \(K\) be a closed pointed cone, and suppose on the contrary that there exist \(u \neq v \in K\) such that for every \(\alpha \in \R\), we have \(u + \alpha(v - u) \in K\). For \(t \ge 1\), consider the sequences \(\{w_+^t\}_{t=1}^{\infty}\) and \(\{w_-^t\}_{t=1}^{\infty}\) where \[ w_+^t = \frac{u + t(v-u)}{\|u + t(v-u)\|_2}, \quad w_-^t = \frac{u - t(v-u)}{\|u - t(v-u)\|_2} \] Now \(w_+^t, w_-^t \in K\) for all \(t \ge 1\) since \(K\) is conic. But \(w_+^t \to w \triangleq \frac{v-u}{\|v-u\|_2}\) and \(w_-^t \to -w\). Since \(K\) is closed, we have \(w, -w \in K\). However, \(w\) has unit norm and is thus not the zero vector, so \(K\) is not pointed, which is a contradiction.

Remark: At times, people also call this no-line property the salience of a cone. A “proper” cone is variously defined by a subset of these properties (closedness, closure under addition, pointedness, salience, and, essentially, being conic) depending on the context.

In fact, the algebraic and the geometric properties can be derived from each other: a good order \(\succeq\) and a proper cone \(K\) correspond one-to-one via \(u \succeq v \iff u - v \in K\). Now we ask: in an arbitrary \(n\)-dimensional universe (or a finite-dimensional Euclidean space), is \(\ge\) the only good order, or equivalently, is \(\R_+^n\) the only proper cone? The answer is no.

Examples of Proper Cone

  1. Lorentz cone / second-order cone / ice cream cone \[ \mathcal{Q}^{n+1} \triangleq \{ (t, x) \in \R \times \R^n: t \ge ||x||_2 \} \] \(\mathcal{Q}^{n+1}\) is a closed proper cone. Then what is the good order associated with this proper cone?

  2. Semi-definite cone \[ \mathcal{S}_+^n \triangleq \{ X \in \mathcal{S}^n: u^T X u \ge 0, \forall u \in \R^n \} \] \(\mathcal{S}_+^n\) is a closed proper cone. Then what is the good order associated with this proper cone?

  3. Zero cone: \(\{ 0 \}\)

  4. New cones from old ones by Cartesian product

    Let \(K_1, \dots, K_m\) be closed proper cones with non-empty interior. Then \[ K \triangleq K_1 \times \dots \times K_m = \{ (x_1, \dots, x_m): x_i \in K_i, \forall i=1,\dots,m \} \] is a closed proper cone with non-empty interior.

    Remark: \(\R_+^n, \mathcal{Q}^{n+1}, \mathcal{S}_+^n\) have non-empty interior: \[ \intr(\R_+^n) = \R_{++}^n \\ \intr(\mathcal{Q}^{n+1}) = \{ (t, x) \in \R \times \R^n: t > ||x||_2 \} \\ \intr(\mathcal{S}_+^n) = \mathcal{S}_{++}^n \\ \]
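As a quick numerical companion (a sketch, not part of the theory; the function names below are illustrative only), membership in each of these three cones can be tested with NumPy:

```python
import numpy as np

def in_nonneg_orthant(x, tol=1e-9):
    """x ∈ R^n_+ iff every component is nonnegative."""
    return bool(np.all(np.asarray(x) >= -tol))

def in_lorentz(t, x, tol=1e-9):
    """(t, x) ∈ Q^{n+1} iff t >= ||x||_2."""
    return bool(t >= np.linalg.norm(x) - tol)

def in_psd(X, tol=1e-9):
    """X ∈ S^n_+ iff X is symmetric with nonnegative eigenvalues."""
    X = np.asarray(X)
    return bool(np.allclose(X, X.T) and np.linalg.eigvalsh(X).min() >= -tol)

print(in_lorentz(5.0, [3.0, 4.0]))       # boundary point: 5 = ||(3, 4)||_2 -> True
print(in_lorentz(4.9, [3.0, 4.0]))       # -> False
print(in_psd([[2.0, 1.0], [1.0, 2.0]]))  # eigenvalues 1 and 3 -> True
print(in_psd([[0.0, 1.0], [1.0, 0.0]]))  # eigenvalues -1 and 1 -> False
```

Note that each test is a finite-precision check with a tolerance; boundary points of the cones sit exactly at the threshold.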

Conic Linear Programming

Formulation

Recall that linear programming is

\[ \begin{align*} \min & \quad c^T x \\ \text{s.t.} & \quad b - A x \ge 0 \end{align*} \] Now replacing \(\ge\) with the good order \(\succeq\) gives conic linear programming. \[ \begin{align*} \tag{CLP} \min & \quad c^T x \\ \text{s.t.} & \quad b - A x \succeq 0 \end{align*} \] The next question is, can we recover Farkas’ lemma and strong duality in this setting? The answer is yes, under suitable regularity conditions, and the good aspect of the good order is that the conclusions of linear programming carry over almost verbatim. Next we generalize the LP problem and show that essentially the same results apply.

Let \(E\) be a finite-dimensional Euclidean space (e.g. \(\R^n, \mathcal{S}^n\)) and \(\langle \cdot, \cdot \rangle\) be the inner product on \(E\) (e.g., on \(\R^n\), \(\langle x, y \rangle = x^T y\); on \(\mathcal{S}^n\), \(\langle X, Y \rangle = \tr(X^T Y)\)). Let \(K \subseteq E\) be a closed proper cone, which induces a good order \(\succeq\) on \(E\). Consider the standard-form CLP, which we take as the primal problem: \[ \begin{equation} \tag{P} \label{primal} \begin{aligned} v_p^* = \inf \quad & \langle c, x \rangle \\ \text{s.t.} \quad & \langle a_i, x \rangle = b_i, \forall i=1,\dots,m \\ & x \in K \subseteq E \end{aligned} \end{equation} \] What is its dual? We mimic the procedure used to build the lower bound for LP. Let \(y \in \R^m\). By the linearity of the inner product, \[ \langle b, y \rangle = b^T y = \sum_{i=1}^m \langle a_i, x \rangle y_i = \sum_{i=1}^m \langle y_i a_i, x \rangle = \langle \sum_{i=1}^m y_i a_i, x \rangle \] If we impose \(c \succeq \sum_{i=1}^m y_i a_i\), in analogy with \(c \ge A^T y\), can we conclude \(\langle c - \sum_{i=1}^m y_i a_i, x \rangle \ge 0\) as we did with \(x^T(c - A^T y) \ge 0\)? In general, this constraint alone is not enough. To strengthen it, construct the set \[ K^* = \{ w \in E: \langle w, x \rangle \ge 0, \forall x \in K \} \]

We can draw some properties for \(K^*\):

  • \(K^*\) is non-empty because \(0 \in K^*\).

  • \(K^*\) is always closed and convex (no matter what \(K\) is), because \(K^*\) is an intersection of closed half-spaces, which are closed and convex.

  • \(K^*\) is conic.

  • If \(K\) is a closed proper cone with non-empty interior, then so is \(K^*\). That is, \(K\) and \(K^*\) are of the same type whenever \(K\) is a closed proper cone with non-empty interior.

  • We have (verify it) \[ (\R_+^n)^* = \R_+^n, \quad (\mathcal{Q}^{n+1})^* = \mathcal{Q}^{n+1}, \quad (\mathcal{S}_+^n)^* = \mathcal{S}_+^n \] that is, these three cones are self-dual.
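The claim \((\mathcal{S}_+^n)^* = \mathcal{S}_+^n\) can be spot-checked numerically. The sketch below is Monte Carlo evidence, not a proof: it samples pairs of PSD matrices to observe \(\langle X, Y \rangle \ge 0\), then exhibits a symmetric non-PSD matrix excluded from the dual cone.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_psd(n):
    """Sample a random PSD matrix as B B^T."""
    B = rng.standard_normal((n, n))
    return B @ B.T

# <X, Y> = tr(X Y) is nonnegative for every sampled pair of PSD matrices:
# evidence that S^n_+ ⊆ (S^n_+)^*.
assert all(np.trace(random_psd(4) @ random_psd(4)) >= -1e-9 for _ in range(200))

# Conversely, a symmetric W with a negative eigenvalue λ is excluded from the
# dual cone: pairing W with the outer product of the corresponding unit
# eigenvector gives exactly λ < 0.
W = np.array([[0.0, 1.0], [1.0, 0.0]])   # eigenvalues -1 and 1, not PSD
lam, V = np.linalg.eigh(W)
v = V[:, 0]                               # unit eigenvector of λ = -1
X = np.outer(v, v)                        # rank-one PSD matrix
print(np.trace(W @ X))                    # ≈ -1 < 0, so W ∉ (S^2_+)^*
```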

In fact, \(K^*\) is called the dual cone. We impose that \(c - \sum_{i=1}^m y_i a_i \in K^*\). Then, the dual problem can be written as \[ \begin{equation} \tag{D} \label{dual} \begin{aligned} v_d^* = \sup \quad & b^T y \\ \text{s.t.} \quad & c - \sum_{i=1}^m y_i a_i \in K^* \\ & y \in \R^m \end{aligned} \end{equation} \]

The constraint of CLP’s dual in essence states that an affine map of the variable \(y\) belongs to a cone. This is a good indicator of whether we are dealing with a CLP.

Second-order Cone Programming

Consider the second-order cone programming: \[ \begin{equation} \tag{SOCP} \begin{aligned} \sup \quad & b^T y \\ \text{s.t.} \quad & c - \sum_{i=1}^m y_i a_i \in \mathcal{Q}^{n+1} \\ & y \in \R^m \end{aligned} \end{equation} \] Let \(a_i = [u_i, \underbrace{a_{i,1}, \dots, a_{i, n}}_{\bar a_i^T}]^T\) and \(c = [v, \underbrace{c_1, \dots, c_n}_{d^T}]^T\). The dual constraint becomes \[ \begin{gathered} c - \sum_{i=1}^m y_i a_i = \begin{bmatrix} v \\ d \end{bmatrix} - \sum_{i=1}^m y_i \begin{bmatrix} u_i \\ \bar a_i \end{bmatrix} = \begin{bmatrix} v - u^T y \\ d - A^T y \end{bmatrix} \succeq 0, \\ \text{where } \begin{aligned}[t] A &= [\bar a_1, \dots, \bar a_m]^T, \\ u &= [u_1, \dots, u_m]^T. \\ \end{aligned} \end{gathered} \] That is, \[ v - u^T y \ge \|d - A^T y\|_2 \] The left-hand side is an affine function of \(y\), and the right-hand side is the norm of an affine function of \(y\). The above is equivalent to \[ \begin{bmatrix} (v - u^T y) I_n & d - A^T y \\ (d - A^T y)^T & v - u^T y \end{bmatrix} \in \mathcal{S}_+^{n+1} \] To show it, first consider the case \(v - u^T y = 0\): the constraint then forces \(d - A^T y = 0\), and the zero matrix is trivially PSD. When \(v - u^T y > 0\), we have \[ \begin{aligned} (v - u^T y)^2 &\ge (d - A^T y)^T (d - A^T y) \\ \underbrace{(v - u^T y)}_{C} &\ge \underbrace{(d - A^T y)^T}_{B^T} \underbrace{\frac{I_n}{v - u^T y}}_{A^{-1}} \underbrace{(d - A^T y)}_{B} \end{aligned} \] Recall the Schur complement criterion: for \(A \in \mathcal{S}_{++}^m\) (i.e., \(A \succ 0\)), \[ \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \in \mathcal{S}_+^{m+n} \iff C - B^T A^{-1} B \in \mathcal{S}_+^n \] Applying it with \(A = (v - u^T y) I_n \succ 0\), we obtain \[ \begin{bmatrix} (v - u^T y) I_n & d - A^T y \\ (d - A^T y)^T & v - u^T y \end{bmatrix} \in \mathcal{S}_+^{n+1} \] The converse similarly follows from the Schur complement. Based on this result, we further claim that an SOCP is equivalent to a semi-definite program (SDP). But it is never necessary to solve an SOCP by converting it to an SDP, which only increases the complexity.
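The Schur-complement equivalence can be sanity-checked numerically: the eigenvalues of the arrow matrix \([[t I_n, x], [x^T, t]]\) are exactly \(t + \|x\|_2\), \(t - \|x\|_2\), and \(t\) (with multiplicity \(n-1\)), so PSD-ness coincides with \(t \ge \|x\|_2\). A small sketch (the helper name `arrow` is illustrative):

```python
import numpy as np

def arrow(t, x):
    """Build [[t I_n, x], [x^T, t]], the matrix from the Schur-complement step."""
    n = len(x)
    M = t * np.eye(n + 1)
    M[:n, n] = x
    M[n, :n] = x
    return M

# PSD-ness of the arrow matrix should coincide with the second-order cone
# condition t >= ||x||_2 on random samples.
rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.standard_normal(3)
    t = 3.0 * rng.standard_normal()
    socp_side = t >= np.linalg.norm(x) - 1e-9
    sdp_side = np.linalg.eigvalsh(arrow(t, x)).min() >= -1e-9
    assert socp_side == sdp_side
print("t >= ||x||_2  <=>  [[t I, x], [x^T, t]] PSD, verified on random samples")
```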

Semi-definite Programming

\[ \begin{equation} \tag{SDP} \begin{aligned} \sup \quad & b^T y \\ \text{s.t.} \quad & c - \sum_{i=1}^m y_i a_i \in \mathcal{S}_+^n \\ & y \in \R^m \end{aligned} \end{equation} \]

In terms of inclusion (and thus expressiveness and difficulty), \( \mathrm{LP} \subseteq \text{convex QCQP} \subseteq \mathrm{SOCP} \subseteq \mathrm{SDP} \subseteq \mathrm{CLP} \).

Weak Duality

Our previous derivation of \(\eqref{primal}\) and \(\eqref{dual}\) implies the weak duality of CLP.

Theorem: weak duality of CLP. Let \(\bar x\) be feasible for \(\eqref{primal}\) and \(\bar y\) be feasible for \(\eqref{dual}\). Then, \(\langle c, \bar x \rangle \ge b^T \bar y\).

Proof: \[ \begin{gathered} c - \sum_{i=1}^m \bar y_i a_i \in K^* \Rightarrow \langle c - \sum_{i=1}^m \bar y_i a_i, \bar x \rangle \ge 0 \\ \Downarrow \\ \begin{aligned} \langle c, \bar x \rangle &\ge \langle \sum_{i=1}^m \bar y_i a_i, \bar x \rangle \\ &= \sum_{i=1}^m \langle a_i, \bar x \rangle \bar y_i \\ &= b^T \bar y \end{aligned} \end{gathered} \]
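To make the inequality concrete, here is a sketch for the special case \(K = K^* = \R_+^n\) (ordinary LP), where feasible primal-dual pairs are easy to construct directly:

```python
import numpy as np

rng = np.random.default_rng(2)

# Construct feasible primal-dual pairs for K = K^* = R^n_+ by design:
# pick x >= 0 and set b = A x (primal feasible); pick y freely and a slack
# s >= 0, then set c = A^T y + s, so that c - A^T y = s ∈ K^* (dual feasible).
for _ in range(1000):
    A = rng.standard_normal((3, 5))
    x = rng.random(5)                # x ∈ K = R^5_+
    b = A @ x
    y = rng.standard_normal(3)
    s = rng.random(5)                # slack s ∈ K^* = R^5_+
    c = A.T @ y + s
    # weak duality: <c, x> - b^T y = <s, x> >= 0
    assert c @ x >= b @ y - 1e-9
print("weak duality held for every sampled feasible pair")
```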

We next investigate another method to convert between the primal and dual in CLP. For a standard CLP problem, we have \[ \begin{gathered} \begin{aligned} \inf \quad & \langle c, x \rangle \\ \text{s.t.} \quad & \langle a_i, x \rangle = b_i, \forall i=1,\dots,m \\ & x \in K \subseteq E \end{aligned} \quad \substack{\langle a_i, x \rangle = b_i \iff b_i - \langle a_i, x \rangle \in \{ 0 \} \\ \equiv} \quad \begin{aligned} -\sup \quad & -\langle c, x \rangle \\ \text{s.t.} \quad & \begin{bmatrix} b \\ 0 \end{bmatrix} - \begin{bmatrix} A \\ -I \end{bmatrix} x \in \{ 0 \}^m \times K \end{aligned} \end{gathered} \] whose dual is \[ \begin{equation*} \begin{aligned} -\inf \quad & [b^T, 0] [y^T, \tilde y^T]^T \\ \text{s.t.} \quad & \begin{bmatrix} A^T & -I \end{bmatrix} \begin{bmatrix} y \\ \tilde y \end{bmatrix} = -c \\ & \begin{bmatrix} y \\ \tilde y \end{bmatrix} \in (\{ 0 \}^m \times K)^* \end{aligned} \end{equation*} \] Note that \((\{ 0 \}^m \times K)^* = (\{ 0 \}^m)^* \times K^* = \R^m \times K^*\). And interestingly, from the above, we observe that an equality constraint is in essence a cone constraint (with the zero cone).

Farkas’ Lemma

Recall the Farkas’ lemma for linear systems. Exactly one of the following two systems is solvable: \[ \begin{gather*} Ax = b, x \ge 0 \\ A^T y \le 0, b^T y > 0 \end{gather*} \] Farkas’ lemma secures a strong duality for LP. We would like to do the same to CLP. First we mimic the two systems in CLP: \[ \begin{gather*} \langle a_i, x \rangle = b_i, \forall i=1,\dots,m; x \in K \tag{I} \\ -\sum_{i=1}^m y_i a_i \in K^*; b^T y > 0 \tag{II} \end{gather*} \] Is it true that exactly one of the above two systems is solvable? We first claim that they can’t be solvable at the same time. Suppose on the contrary they both hold. By \(\mathrm{(II)}\), we have \[ \begin{aligned} b^T y =\sum_{i=1}^m \langle a_i, x \rangle y_i =\langle \sum_{i=1}^m y_i a_i, x \rangle &> 0 \\ \end{aligned} \] But that \(-\sum_{i=1}^m y_i a_i \in K^*\) implies \(\langle \sum_{i=1}^m y_i a_i, x \rangle = -\langle -\sum_{i=1}^m y_i a_i, x \rangle \le 0\), which is a contradiction. On the other hand, for CLP, it is possible that neither \(\mathrm{(I)}\) nor \(\mathrm{(II)}\) is solvable.
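In the original LP setting (\(K = \R_+^n\)), by contrast, exactly one system is always solvable, and this can be observed numerically. The sketch below (assuming SciPy’s `linprog` is available; the helper names are illustrative) checks feasibility of each system on random data, normalizing \(b^T y > 0\) to \(b^T y = 1\) by conic scaling:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)

def solvable_I(A, b):
    """Is  Ax = b, x >= 0  feasible?"""
    res = linprog(np.zeros(A.shape[1]), A_eq=A, b_eq=b, bounds=(0, None))
    return res.status == 0

def solvable_II(A, b):
    """Is  A^T y <= 0, b^T y > 0  feasible?  (normalize b^T y = 1 by scaling)"""
    res = linprog(np.zeros(A.shape[0]), A_ub=A.T, b_ub=np.zeros(A.shape[1]),
                  A_eq=b.reshape(1, -1), b_eq=[1.0], bounds=(None, None))
    return res.status == 0

for _ in range(50):
    A = rng.standard_normal((3, 6))
    b = rng.standard_normal(3)
    assert solvable_I(A, b) != solvable_II(A, b)   # exactly one system is solvable
print("exactly one of (I) and (II) was solvable in every trial")
```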

Example: Let \(E = \mathcal{S}^2, K = \mathcal{S}_+^2\). Let \[ A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, b = \begin{bmatrix} 0 \\ 2 \end{bmatrix} \] Then \[ \begin{gathered} \begin{cases} \langle A_1, X \rangle = 0 \\ \langle A_2, X \rangle = 2 \\ X \in \mathcal{S}_+^2 \end{cases} \iff \begin{cases} X_{11} = 0 \\ 2 X_{12} = 2 \\ X \in \mathcal{S}_+^2 \end{cases} \iff \begin{bmatrix} 0 & 1 \\ 1 & X_{22} \end{bmatrix} \in \mathcal{S}_+^2 \text{, which is unsolvable} \\ \begin{cases} -(y_1 A_1 + y_2 A_2) \in \mathcal{S}_+^2 \\ 2 y_2 > 0 \end{cases} \iff \begin{cases} \begin{bmatrix} -y_1 & -y_2 \\ -y_2 & 0 \\ \end{bmatrix} \in \mathcal{S}_+^2 \\ y_2 > 0 \end{cases} \text{, which is unsolvable} \end{gathered} \]

The question is: what goes wrong? Recall that in the proof of the original Farkas’ lemma, we apply the separation theorem to the setting \[ b \notin \{ Ax: x \in \R_+^n \} \] The right-hand set, a finitely generated cone, is always closed. However, for an arbitrary closed proper cone \(K\), the set \[ \{ (\langle a_1, x \rangle, \dots, \langle a_m, x \rangle): x \in K \} \] is not always closed. Without closedness, we can’t properly apply the point-set separation theorem. For the same reason, the optimum may not be attained in CLP.

Example: Continue with the \(A_1, A_2, b\) in the previous example. Consider the set \[ S = \{ (\langle A_1, X \rangle, \langle A_2, X \rangle): X \in \mathcal{S}_+^2 \} \] Since \((\langle A_1, X \rangle, \langle A_2, X \rangle) = (X_{11}, 2 X_{12})\), the set \(S\) is the image of \(\mathcal{S}_+^2\) under this linear map. Write \[ X = \begin{bmatrix} X_{11} & X_{12} \\ X_{12} & X_{22} \end{bmatrix} \] If \(X_{11} = 0\), the only way to make \(X\) PSD is to make \(X_{12} = 0\); if \(X_{11} > 0\), \(X_{12}\) can be arbitrary because we can take \(X_{22}\) sufficiently large such that \(X_{11} X_{22} \ge X_{12}^2\). Therefore, \[ S = \{ (0, 0) \} \cup \{ (x, y): x > 0, y \in \R \} \] which is not closed: it contains \((\epsilon, 2)\) for every \(\epsilon > 0\) but not the limit point \((0, 2)\).
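A quick numerical sketch of this non-closedness (the helper names are illustrative): the points \((\epsilon, 2)\) lie in \(S\) for every \(\epsilon > 0\), but their limit \((0, 2)\) does not.

```python
import numpy as np

def image_point(X):
    """(<A1, X>, <A2, X>) = (X11, 2 X12) for the matrices of the example."""
    return (X[0, 0], 2.0 * X[0, 1])

def is_psd(X, tol=1e-9):
    return np.linalg.eigvalsh(X).min() >= -tol

# (eps, 2) ∈ S for every eps > 0: take X22 = 2/eps so that det X = 1 > 0.
for eps in [1.0, 1e-2, 1e-4, 1e-6]:
    X = np.array([[eps, 1.0], [1.0, 2.0 / eps]])
    assert is_psd(X)
    print(image_point(X))             # first coordinate shrinks toward 0

# But the limit (0, 2) itself is NOT in S: with X11 = 0 and X12 = 1,
# no choice of X22 makes X PSD.
for X22 in [1.0, 1e3, 1e6]:
    assert not is_psd(np.array([[0.0, 1.0], [1.0, X22]]))
```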

If only we could guarantee the closedness of the image of a proper cone under a linear map!

Theorem: conic Farkas’ lemma. Suppose there exists a \(\bar y \in \R^m\) satisfying \[ -\sum_{i=1}^m \bar y_i a_i \in \intr(K^*) \tag{Slater condition} \] Then exactly one of \(\mathrm{(I)}\) and \(\mathrm{(II)}\) is solvable.

The spirit of the Slater condition (and its many variants) is that “some vector lies in the interior of some set”. In this case, it guarantees the closedness of the image set \(\{ (\langle a_1, x \rangle, \dots, \langle a_m, x \rangle): x \in K \}\).

Strong Duality

Recall the LP strong duality theorem: suppose the primal is feasible and bounded below (or, alternatively, the dual is feasible and bounded above); then \(v_p^* = v_d^*\) and both the primal and the dual have optimal solutions. The question is: what about the duality gap and attainment in CLP?

Example: nonzero duality gap. Consider the SDP: \[ \begin{aligned} v_p^* = \inf \quad & y_1\\ \text{s.t.} \quad & \begin{bmatrix} 0 & y_1 & 0 \\ y_1 & y_2 & 0 \\ 0 & 0 & 1+y_1 \end{bmatrix} \in \mathcal{S}_+^3 \end{aligned} \] Feasibility forces \(y_1 = 0\) (the top-left entry is \(0\), so the off-diagonal \(y_1\) must vanish), while \(y_2 \ge 0\) is free. Thus, \(v_p^* = 0\). It can be rewritten as \[ \begin{aligned} v_p^* = - \sup \quad & \underbrace{[-1, 0]}_{b^T} [y_1, y_2]^T \\ \text{s.t.} \quad & \underbrace{ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} }_{C} - y_1 \underbrace{ \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix} }_{A_1} - y_2 \underbrace{ \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \\ \end{bmatrix} }_{A_2} \in \mathcal{S}_+^3 \end{aligned} \] Its dual is \[ \begin{aligned} v_d^* = -\inf \quad & X_{33} \\ \text{s.t.} \quad & -2 X_{12} - X_{33} = -1 \\ & -X_{22} = 0 \\ & X \in \mathcal{S}_+^3 \end{aligned} \] The dual is feasible: \(X_{22} = 0\) and \(X \in \mathcal{S}_+^3\) force \(X_{12} = 0\), so \(X_{33}\) can only be \(1\). Thus, \(v_d^* = -1\).

Note that the duality gap is nonzero.
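The gap can be checked numerically. In this sketch, `M` and `is_psd` are illustrative helpers: the first asserts confirm that primal feasibility pins \(y_1\) at \(0\), and the final lines verify a dual-feasible point with objective \(-1\).

```python
import numpy as np

def M(y1, y2):
    """Constraint matrix of the primal SDP in the example."""
    return np.array([[0.0, y1,  0.0],
                     [y1,  y2,  0.0],
                     [0.0, 0.0, 1.0 + y1]])

def is_psd(X, tol=1e-9):
    return np.linalg.eigvalsh(X).min() >= -tol

# Any y1 != 0 destroys feasibility (the top-left 2x2 block gets determinant
# -y1^2 < 0), so the primal value inf y1 is pinned at 0.
assert is_psd(M(0.0, 5.0))
assert not is_psd(M(0.1, 5.0)) and not is_psd(M(-0.1, 5.0))

# A dual-feasible point: X22 = 0 and X PSD force X12 = 0, hence X33 = 1.
X = np.diag([0.0, 0.0, 1.0])
assert is_psd(X)
assert np.isclose(-2.0 * X[0, 1] - X[2, 2], -1.0)   # first equality constraint
assert np.isclose(-X[1, 1], 0.0)                    # second equality constraint
print("primal value: 0, dual objective -X33 =", -X[2, 2])   # duality gap of 1
```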

Example: non-attainment. Consider the SOCP: \[ \begin{aligned} v_p^* = \sup \quad & -y_1 \\ \text{s.t.} \quad & (y_1 + y_2, 1, y_1 - y_2) \in \mathcal{Q}^3 \end{aligned} \] The constraint is \[ y_1 + y_2 \ge \sqrt{1 + (y_1 - y_2)^2} \iff 4 y_1 y_2 \ge 1 \text{ and } y_1 + y_2 > 0 \] This implies that \(y_1, y_2 > 0\). Hence \(v_p^* = 0\): take \(y_1 \to 0^+\) with \(y_2 = 1/(4 y_1)\); but the supremum cannot be attained. The problem can be rewritten as \[ \begin{aligned} v_p^* = \sup \quad & [-1, 0] [y_1, y_2]^T \\ \text{s.t. }\quad & \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} - y_1 \begin{bmatrix} -1 \\ 0 \\ -1 \end{bmatrix} - y_2 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \in \mathcal{Q}^3 \end{aligned} \] Its dual is \[ \begin{aligned} v_d^* = \inf \quad & z_2 \\ \text{s.t.} \quad & -z_1 - z_3 = -1 \\ & -z_1 + z_3 = 0 \\ & (z_1, z_2, z_3) \in \mathcal{Q}^3 \end{aligned} \] The only solution is \((1/2, 0, 1/2)\), and \(v_d^* = 0\) is attained at this point.
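Numerically, one can walk along the boundary \(4 y_1 y_2 = 1\) of the feasible region and watch the objective approach, but never reach, \(0\) (a small sketch):

```python
import numpy as np

# Walk along the boundary 4 y1 y2 = 1 of the feasible region: y1 = 1/t, y2 = t/4.
for t in [1.0, 1e2, 1e4, 1e6]:
    y1, y2 = 1.0 / t, t / 4.0
    # verify the SOCP constraint (y1 + y2, 1, y1 - y2) ∈ Q^3
    assert y1 + y2 >= np.hypot(1.0, y1 - y2) - 1e-9
    print(-y1)   # objective -y1 climbs toward 0 but never reaches it
```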

Despite the above, can we draw anything about the duality gap and the attainment of CLP?

Theorem: strong duality of CLP. Suppose \(\eqref{primal}\) is bounded below and satisfies Slater’s condition, i.e., there exists a feasible \(\bar x\) such that \(\bar x \in \intr(K)\). Then \(v_p^* = v_d^*\) and there exists a \(y^*\) that is optimal for \(\eqref{dual}\).

On the other hand, suppose \(\eqref{dual}\) is bounded above and satisfies Slater’s condition, i.e., there exists a feasible \(\bar y\) such that \(c - \sum_{i=1}^m \bar y_i a_i \in \intr(K^*)\). Then \(v_p^* = v_d^*\) and there exists an \(x^*\) that is optimal for \(\eqref{primal}\).

With Slater’s condition on one side, we can guarantee zero duality gap, but attainment only on the other side. In this sense, CLP strong duality is much “weaker” than LP strong duality; refer to the non-attainment example.

Pay attention to what CLP strong duality does not say as well: when Slater’s condition fails, the theorem is simply silent — the duality gap may be nonzero (refer to the nonzero duality gap example), yet zero duality gap and attainment can still hold in other instances.
