Linear Programming
Recall that the linear programming problem is \[ \label{lp} \begin{aligned} \min_{x} \quad& c^T x \\ \text{s.t.} \quad& a_i^T x \le b_i, i=1,\dots,m \end{aligned} \tag{LP} \] where \(c \in \R^n\) and \(a_i \in \R^{n}, b_i \in \R, i=1,\dots,m\).
Definition: A polyhedron is the intersection of a finite set of halfspaces. A bounded polyhedron is called a polytope.
Therefore, the feasible set of an LP is a polyhedron. Interestingly, all polyhedra are convex, since halfspaces are convex and convexity is preserved under intersection. That is, the feasible set of an LP is convex.
Standard Form
The standard form of LP is \[ \label{standard-lp} \begin{aligned} \min_{x \in \R_+^n} \quad & c^T x \\ \text{s.t.} \quad & A x = b \\ \end{aligned} \tag{Standard LP} \] where \(c \in \R^n, A \in \R^{m \times n}, b \in \R^m\). Standard-form LPs admit standard solution methods. Thus, the next question is how to convert an ordinary LP problem into a standard one.
Stack the constraints of \(\eqref{lp}\) as \(Ax \le b\), where the rows of \(A \in \R^{m \times n}\) are \(a_i^T\). Now write \(x = x^+ - x^-\), where \(x^+, x^- \in \R_+^n\), and introduce a slack variable \(s \in \R_+^m\). \(\eqref{lp}\) is then equivalent to \[ \begin{aligned} \min_{x^+, x^- \ge 0, s \ge 0} \quad & c^T (x^+ - x^-) \\ \text{s.t.} \quad & A (x^+ - x^-) + s = b \\ \end{aligned} \] Let \[ A' = \left[\begin{array}{c:c:c} A & -A & I \end{array}\right] \in \R^{m \times (2n+m)} \\ x' = \begin{bmatrix} x^+ \\ \hdashline x^- \\ \hdashline s \end{bmatrix} \in \R_+^{2n+m}, c' = \begin{bmatrix} c \\ \hdashline -c \\ \hdashline 0 \end{bmatrix} \in \R^{2n+m} \] The problem becomes \[ \begin{aligned} \min_{x' \in \R_+^{2n+m}} \quad & (c')^T x' \\ \text{s.t.} \quad & A' x' = b \\ \end{aligned} \] which has the form of a standard LP. For later reference, we restate the standard LP problem:
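As a quick numerical sanity check of this conversion, here is a minimal sketch (assuming `numpy` and `scipy` are available; the data \(A, b, c\) are made up for illustration) that solves a small inequality-form LP and its standard-form reformulation, and compares the optimal values:

```python
import numpy as np
from scipy.optimize import linprog

# Original (LP): min c^T x  s.t.  Ax <= b,  x free
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -1.0])

lp = linprog(c, A_ub=A, b_ub=b, bounds=(None, None))

# Standard form: A' x' = b, x' >= 0, with x' = (x+, x-, s), c' = (c, -c, 0)
m, n = A.shape
A_std = np.hstack([A, -A, np.eye(m)])         # [A  -A  I]
c_std = np.concatenate([c, -c, np.zeros(m)])  # (c, -c, 0)
std = linprog(c_std, A_eq=A_std, b_eq=b, bounds=(0, None))

print(lp.fun, std.fun)  # both report the same optimal value
```

Both solves should agree, since the reformulation is an exact equivalence rather than a relaxation.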
\[ \label{primal} \begin{aligned} v_p^* = \min_{x \in \R_+^n} \quad & c^T x \\ \text{s.t.} \quad & A x = b \\ \end{aligned} \tag{P} \]
The first question to ask is whether there is an optimal solution to it. The answer is:
Proposition: If \(\eqref{primal}\) is feasible, then either 1) the optimal value \(v_p^* = -\infty\) (no optimal solution), or 2) it has an optimal solution.
Farkas’ Lemma
But a second question arises: how do we certify that \(\eqref{primal}\) is infeasible? We face a dilemma similar to the one in set-set separation: it is easy to test the feasibility of any single \(x\), but it is prohibitive to test all \(x\)'s just to show that the original problem is infeasible.
The idea is that, given the feasible polyhedron \(P \triangleq \{ x \in \R_+^n: Ax = b \}\) of the original problem, we construct another auxiliary polyhedron \(Q\) such that exactly one of the following holds:
- \(P \ne \emptyset, Q = \emptyset\);
- \(P = \emptyset, Q \ne \emptyset\).
Theorem: Farkas’ lemma. Let \(A \in \R^{m \times n}\) and \(b \in \R^m\) be given. Then exactly one of the following systems is solvable:
- \(A x = b, x \ge 0\);
- \(A^T y \le 0, b^T y > 0\).
Interpretation:
\(A x = b, x \ge 0\) means that \(b\) is a non-negative linear combination of \(a_1, \dots, a_n\), the columns of \(A\). \(A^T y \le 0\) means that \(y\) makes a non-acute angle with each of \(a_1, \dots, a_n\); \(b^T y > 0\) means \(y\) makes an acute angle with \(b\).
Proof:
We claim that the above two systems cannot be solvable at the same time; otherwise, for some \(x_0 \ge 0, y_0 \in \R^m\) solving them, we get the contradiction \[ \underbrace{x_0^T}_{\ge 0} \underbrace{A^T y_0}_{\le 0} = y_0^T A x_0 = \underbrace{y_0^T b}_{> 0} \] We then claim that the two systems cannot be unsolvable at the same time. Suppose system 1 is unsolvable.
In this case, \(b \notin A^+ \triangleq \{ Ax: x \ge 0 \}\). It can be verified that \(A^+\) is non-empty (\(0 \in A^+\)), closed (closedness is not generally preserved under affine transformations, but in this case it is <refer to this handout>), and convex (because convexity is preserved under the affine transformation \(A^+ = A \R_+^n\)). Then by the point-set separation theorem, there exists a \(y_0 \in \R^m\) such that \[ \sup_{x \ge 0} y_0^T A x < y_0^T b \] Taking \(x = 0\), we get \[ y_0^T b > 0 \] We claim that \(A^T y_0 \le 0\). Suppose on the contrary that \((A^T y_0)_i > 0\) for some \(i\). Then for any \(\lambda \ge 0\), taking \(x = \lambda e_i\) gives \[ y_0^T A x = x^T A^T y_0 = \underbrace{\lambda}_{\ge 0} \underbrace{(A^T y_0)_i}_{> 0} < y_0^T b \] which is impossible as \(\lambda \to \infty\).
Therefore, system 2 is solvable whenever system 1 is unsolvable; the two systems cannot both be unsolvable.
We first construct \(Q' \triangleq \{ y \in \R^m: A^T y \le 0, b^T y > 0 \}\). Note that \(Q'\) is not a polyhedron yet, because \(\{ y \in \R^m: b^T y > 0 \}\) is open and is not a halfspace. However, observe that (verify it: scale any solution of the former so that \(b^T y = 1\)) \[ A^T y \le 0, b^T y > 0 \text{ is solvable} \iff A^T y \le 0, b^T y = 1 \text{ is solvable} \] \(\{ y \in \R^m: b^T y = 1 \}\) can be rewritten as \(\{ y \in \R^m: b^T y \ge 1, b^T y \le 1 \}\), which is an intersection of halfspaces. Therefore, \(Q \triangleq \{ y \in \R^m: A^T y \le 0, b^T y \ge 1, b^T y \le 1 \}\) is non-empty if and only if \(Q'\) is non-empty. Moreover, \(Q\) is an intersection of halfspaces, as desired.
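To make the certificate concrete, here is a sketch (assuming `numpy` and `scipy` are available; the tiny system is made up) that finds a point of \(Q\) for an infeasible instance of \(Ax = b, x \ge 0\) by solving a pure feasibility LP with a zero objective:

```python
import numpy as np
from scipy.optimize import linprog

# Ax = b, x >= 0 is infeasible here: x1 + x2 = -1 has no solution with x >= 0
A = np.array([[1.0, 1.0]])
b = np.array([-1.0])

# Search Q = { y : A^T y <= 0, b^T y = 1 } for a Farkas certificate
res = linprog(np.zeros(1),                         # zero objective: feasibility only
              A_ub=A.T, b_ub=np.zeros(2),          # A^T y <= 0
              A_eq=b.reshape(1, -1), b_eq=[1.0],   # b^T y = 1
              bounds=(None, None))                 # y is free
y = res.x
print(y)  # y = [-1]: A^T y <= 0 and b^T y = 1 certify infeasibility
```

Here the certificate is unique (\(b^T y = 1\) forces \(y = -1\)), but in general any point of \(Q\) works.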
Thus, we have converted the infeasibility of \(P\) into the feasibility of \(Q\). The third question: suppose we have verified that \(P\) is feasible; given a solution to \(\eqref{primal}\), how do we certify its optimality? The idea is to establish a lower bound on the optimal value \(v^*_p\).
Duality
Consider a \(y \in \R^m\) such that \(A^T y \le c\). Then for any feasible \(x\) (i.e., \(x \in \R_+^n\) with \(A x = b\)), we have \[ b^T y = x^T A^T y \le c^T x \] The above holds for the optimizer \(x_p^*\) of \(\eqref{primal}\) as well. Therefore, \[ b^T y \le c^T x_p^* = v_p^* \] We can try to find the largest such lower bound w.r.t. \(y\): \[ \label{dual} \begin{aligned} v_d^* = \max_y &\quad b^T y \\ \text{s.t.} &\quad A^T y \le c \end{aligned} \tag{D} \] Automatically, \(v_d^* \le v_p^*\). Note that \(\eqref{primal}\) is also the dual of \(\eqref{dual}\): dualizing \(\eqref{dual}\) recovers \(\eqref{primal}\).
Theorem: weak duality of LP. Let \(\bar x\) be feasible for \(\eqref{primal}\) and \(\bar y\) be feasible for \(\eqref{dual}\). Then, \[ c^T \bar x \ge b^T \bar y \]
Corollary:
- If \(v_p^* = -\infty\), then \(\eqref{dual}\) is infeasible.
- If \(v_d^* = +\infty\), then \(\eqref{primal}\) is infeasible.
- If \(\bar x\) is feasible for \(\eqref{primal}\), \(\bar y\) is feasible for \(\eqref{dual}\), and the duality gap \(\Delta(\bar x, \bar y) \triangleq c^T \bar x - b^T \bar y\) is zero, then \(\bar x\) is optimal for \(\eqref{primal}\) and \(\bar y\) is optimal for \(\eqref{dual}\).
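To illustrate the last point of the corollary, a small sketch (assuming `numpy` and `scipy`; the data are arbitrary) that solves a primal-dual pair and checks the duality gap:

```python
import numpy as np
from scipy.optimize import linprog

# Primal (P): min c^T x  s.t.  Ax = b, x >= 0
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([2.0, 3.0, 1.0])
p = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))

# Dual (D): max b^T y  s.t.  A^T y <= c
# linprog minimizes, so minimize -b^T y instead.
d = linprog(-b, A_ub=A.T, b_ub=c, bounds=(None, None))

x, y = p.x, d.x
gap = c @ x - b @ y          # duality gap Δ(x, y)
print(c @ x, b @ y, gap)     # a zero gap certifies optimality of both
```

The pair \((x, y)\) with zero gap is exactly the optimality certificate the corollary describes.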
What about the converses of conclusions above?
If \(\eqref{dual}\) is infeasible, then either \(v_p^* = -\infty\) or \(\eqref{primal}\) is infeasible.
It is rather easy to construct a problem such that both \(\eqref{primal}\) and \(\eqref{dual}\) are infeasible, since the constraints \(Ax = b, x \ge 0\) and \(A^T y \le c\) are quite independent. The following gives an example: \[ A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, b = [1, 1]^T, c = [-1, -1]^T \] Adding the two primal equalities gives \(0 = 2\), so \(\eqref{primal}\) is infeasible; adding the two dual inequalities gives \(0 \le -2\), so \(\eqref{dual}\) is infeasible.
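Such a pair can be checked numerically (assuming `numpy` and `scipy`); the sketch below uses \(A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}\), \(b = [1,1]^T\), \(c = [-1,-1]^T\), for which both problems are infeasible:

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, -1.0], [-1.0, 1.0]])
b = np.array([1.0, 1.0])
c = np.array([-1.0, -1.0])

# Primal (P): min c^T x  s.t.  Ax = b, x >= 0
p = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))
# Dual (D): max b^T y  s.t.  A^T y <= c  (minimize -b^T y)
d = linprog(-b, A_ub=A.T, b_ub=c, bounds=(None, None))

print(p.status, d.status)  # scipy status code 2 means "infeasible"
```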
If \(\eqref{primal}\) is infeasible, then either \(v_d^* = +\infty\) or \(\eqref{dual}\) is infeasible.
Theorem: strong duality for LP. Suppose \(\eqref{primal}\) has an optimal solution \(x_p^*\). Then \(\eqref{dual}\) has an optimal solution \(y_d^*\) and \(v_p^* = c^T x_p^* = b^T y_d^* = v_d^*\).
Proof:
We claim that \[ Ax = b, x \ge 0, c^T x - v_p^* < 0 \tag{I} \] is unsolvable, since every feasible \(x\) satisfies \(c^T x \ge v_p^*\). Consider homogenizing the above system: \[ Ax - bt = 0, c^T x - v_p^* t < 0, x \ge 0, t \ge 0 \tag{II} \] We argue that \(\textrm{(I)}\) is solvable if and only if \(\textrm{(II)}\) is solvable. If \(\bar x\) solves \(\textrm{(I)}\), then \((\bar x, 1)\) solves \(\textrm{(II)}\) too. Conversely, if \((\bar x, \bar t)\) solves \(\textrm{(II)}\), we discuss by case.
- \(\bar t > 0\). In this case, \(\bar x / \bar t\) solves \(\textrm{(I)}\).
- \(\bar t = 0\). In this case, \(x_p^* + \bar x\) solves \(\textrm{(I)}\): indeed, \(A(x_p^* + \bar x) = b\), \(x_p^* + \bar x \ge 0\), and \(c^T (x_p^* + \bar x) = v_p^* + c^T \bar x < v_p^*\) because \(c^T \bar x < 0\).
\(\textrm{(II)}\) can be rewritten as \[ \begin{bmatrix} A & -b \\ -A & b \\ -I & 0_{n \times 1} \\ 0_{1 \times n} & -1 \end{bmatrix} \begin{bmatrix} x \\ t \end{bmatrix} \le 0, \quad [-c^T, v_p^*] \begin{bmatrix} x \\ t \end{bmatrix} > 0 \] Since \(\textrm{(I)}\) is unsolvable, so is \(\textrm{(II)}\). By Farkas' lemma, the following system is solvable: \[ \begin{bmatrix} A^T & -A^T & -I & 0_{n\times 1} \\ -b^T & b^T & 0_{1 \times n} & -1 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{bmatrix} = \begin{bmatrix} -c \\ v_p^* \end{bmatrix}, \quad \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{bmatrix} \ge 0 \]
This means \[ A^T (z_1 - z_2) - z_3 = -c \Rightarrow \\ A^T (z_2 - z_1) \le c \\ \]
\[ -b^T (z_1 - z_2) - z_4 = v_p^* \Rightarrow \\ b^T (z_2 - z_1) \ge v_p^* \] Thus \(y^* \triangleq z_2 - z_1\) is a feasible solution to \(\eqref{dual}\) with \(b^T y^* \ge v_p^*\). Combined with weak duality, which states \(b^T y^* \le v_p^*\), we get \(b^T y^* = v_p^*\). Therefore, \(y_d^* = y^*\) is optimal for \(\eqref{dual}\) and \(c^T x_p^* = b^T y_d^*\).
Corollary: complementary slackness. Let \(\bar x\) be feasible for \(\eqref{primal}\) and \(\bar y\) be feasible for \(\eqref{dual}\). Then they are optimal for their respective problems if and only if, for every \(i\), \[ \underbrace{\bar x_i}_{\substack{\text{$i$-th} \\ \text{primal variable}}}\ \underbrace{(c - A^T \bar y)_i}_{\substack{\text{$i$-th } \\ \text{dual constraint slack}}} = 0 \] Proof: the duality gap is \[ c^T \bar x - b^T \bar y = c^T \bar x - (A \bar x)^T \bar y = \bar x^T (c - A^T \bar y) \] By weak and strong duality, \(\bar x\) and \(\bar y\) are both optimal if and only if the gap is zero. Since \(\bar x \ge 0\) and \(c - A^T \bar y \ge 0\), the sum \(\bar x^T (c - A^T \bar y)\) is zero if and only if every term is zero, which is exactly complementary slackness.
After bounding \(\eqref{primal}\)'s optimal value from below by \(\eqref{dual}\)'s, strong duality tells us that finding optimal solutions to \(\eqref{primal}\) and \(\eqref{dual}\) is equivalent to finding a feasible solution to \[ \begin{gather} Ax = b, x \ge 0 \tag{primal constraint} \\ A^T y \le c \tag{dual constraint} \\ c^T x = b^T y \tag{zero duality gap} \end{gather} \] Note that, given the other constraints, the zero duality gap is in essence equivalent to \[ x^T (c - A^T y) = 0 \tag{complementary slackness} \] An optimization problem has been converted into a feasibility problem. In particular, an LP optimization problem is no harder than an LP feasibility problem.
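Since the zero-gap condition \(c^T x = b^T y\) is itself linear, the whole system above is an LP feasibility problem in the joint variable \((x, y)\). A sketch (assuming `numpy` and `scipy`; the toy data are made up) that recovers the optimal value of a small \(\eqref{primal}\) by solving the system with a zero objective:

```python
import numpy as np
from scipy.optimize import linprog

# A toy standard-form LP: min c^T x  s.t.  Ax = b, x >= 0
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([2.0, 3.0, 1.0])
n, m = 3, 1

# Feasibility system in w = (x, y):
#   Ax = b,  c^T x - b^T y = 0   (equalities)
#   A^T y <= c                   (inequalities)
#   x >= 0, y free
A_eq = np.block([[A, np.zeros((m, m))],
                 [c.reshape(1, -1), -b.reshape(1, -1)]])
b_eq = np.concatenate([b, [0.0]])
A_ub = np.hstack([np.zeros((n, n)), A.T])
b_ub = c
bounds = [(0, None)] * n + [(None, None)] * m

res = linprog(np.zeros(n + m), A_ub=A_ub, b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, y = res.x[:n], res.x[n:]
print(c @ x)  # optimal value of (P), recovered with a zero objective
```

Any feasible point of this system is primal-dual optimal by weak duality, so no objective is needed.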
Examples of LP Problem
Vertex Cover
Given a graph \(G = (V, E)\) and a cost function \(c: V \mapsto \R^+\), find a vertex cover that minimizes the overall cost. Here we say that \(S \subseteq V\) is a vertex cover if every edge has at least one endpoint in \(S\). The cost of a vertex cover is the sum of the costs of the vertices it contains.
Let \(x_i \in \{ 0,1 \}\) be an indicator variable defined as \[ x_i = \begin{cases} 1, & \text{if vertex $i$ is in the cover} \\ 0, & \text{otherwise} \end{cases} \] Then we have the following integer programming problem: \[ \label{vc} \begin{aligned} v^* = \min_x \quad &\sum_i x_i c_i \\ \text{s.t.} \quad &x_i + x_j \ge 1, \forall (i,j) \in E \\ &x_i \in \{ 0,1 \}, \forall i=1,\dots,|V| \end{aligned} \tag{Vertex Cover} \] The problem above is NP-hard. One typical way to tackle it is to relax the constraints to make the problem easier. \[ \label{vc-lp-I} \begin{aligned} v_\text{lp}^* = \min_x \quad & \sum_i x_i c_i \\ \text{s.t.} \quad & x_i + x_j \ge 1, \forall (i,j) \in E \\ & 0 \le x_i \le 1, \forall i=1,\dots,|V| \end{aligned} \tag{Relaxed VC I} \] Note that we can further drop \(x_i \le 1\) in \(\eqref{vc-lp-I}\): for any feasible point with \(x_i > 1\), we can reset \(x_i' = 1\) without violating the constraints and without increasing the objective (because \(c \ge 0\)): \[ \label{vc-lp-II} \begin{aligned} v_\text{lp}^* = \min_x \quad & \sum_i x_i c_i \\ \text{s.t.} \quad & x_i + x_j \ge 1, \forall (i,j) \in E \\ & 0 \le x_i, \forall i=1,\dots,|V| \end{aligned} \tag{Relaxed VC II} \] Once \(x_\text{lp}^*\) solves \(\eqref{vc-lp-II}\), we need to round it to a feasible \(x_\text{rd}\) for \(\eqref{vc}\) such that \(v_\text{rd} = c^T x_\text{rd} \approx v^*\). Obviously, \(v^* \ge v_\text{lp}^*\) and \(v^* \le v_\text{rd}\). The question is whether we can upper-bound \(v_\text{rd}\) by \(\alpha v^*\) for some \(\alpha \ge 1\). Here \(\alpha\) is called the approximation ratio.
Definition: Let \(P = \{ x \in \R^n: a_i^T x \le b_i, i=1,\dots,m \}\) be a polyhedron and let \(\bar x \in P\). Let \(I(\bar x) = \{ i: a_i^T \bar x = b_i \}\) be the active index set. We say that \(\bar x\) is a vertex of \(P\) if \(\{ a_i: i \in I(\bar x) \}\) contains \(n\) linearly-independent vectors. Equivalently, the system \[ a_i^T x = b_i, i \in I(\bar x) \] has a unique solution (which is \(\bar x\)).
Each active constraint places the solution on the boundary of the feasible set, or rather, on a hyperplane. The \(n\) linearly-independent vectors ensure that the solution is the intersection of \(n\) such hyperplanes, which is why it is called a vertex.
Proposition: If an LP has a vertex feasible solution and is bounded (i.e., the optimal value is finite), then it has a vertex optimal solution (proof omitted).
Theorem: Let \(Q = \{ x \in \R^{|V|}: x_i + x_j \ge 1, \forall (i,j) \in E; x \ge 0 \}\), which is exactly the feasible set of \(\eqref{vc-lp-II}\). Then \(Q\) has a vertex (it is non-empty and, because \(x \ge 0\), contains no line). Let \(\bar x\) be any such vertex. Then, \[ \forall i, \bar x_i \in \{ 0, 1/2, 1 \} \]
By the theorem and proposition above, and because \(\eqref{vc-lp-II}\) is bounded (verify it), there is a vertex optimal solution \(x_\text{lp}^*\) to \(\eqref{vc-lp-II}\). Next, we round \(x_\text{lp}^*\) as follows: if \((x_\text{lp}^*)_i = 0\), set \((x_\text{rd})_i = 0\); otherwise set \((x_\text{rd})_i = 1\). This rounding satisfies the constraints of \(\eqref{vc}\): every edge \((i,j)\) has \((x_\text{lp}^*)_i + (x_\text{lp}^*)_j \ge 1\), so at least one endpoint is nonzero and is rounded to \(1\). Since rounding at most doubles each entry (the worst case being entries equal to \(1/2\)), \(v_\text{rd} \le 2 v_\text{lp}^*\). Thus, we can conclude that \[ v_\text{lp}^* \le v^* \le c^T x_\text{rd} \le 2 v_\text{lp}^* \le 2 v^* \]
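The whole pipeline, relax, solve, round, can be sketched on a triangle graph with unit costs (assuming `numpy` and `scipy`; on this graph the unique LP optimum is \(x = (1/2, 1/2, 1/2)\), the worst case for rounding):

```python
import numpy as np
from scipy.optimize import linprog

# Relaxed VC II on a triangle graph with unit costs:
# min sum_i c_i x_i  s.t.  x_i + x_j >= 1 for each edge, x >= 0
edges = [(0, 1), (0, 2), (1, 2)]
n = 3
c = np.ones(n)

# Each edge (i, j) becomes -x_i - x_j <= -1 in linprog's A_ub form.
A_ub = np.zeros((len(edges), n))
for k, (i, j) in enumerate(edges):
    A_ub[k, i] = A_ub[k, j] = -1.0
b_ub = -np.ones(len(edges))

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
x_lp = res.x                          # here: (1/2, 1/2, 1/2), LP value 1.5
x_rd = (x_lp > 1e-9).astype(float)    # 0 stays 0; any positive entry rounds to 1
print(x_lp, c @ x_lp, c @ x_rd)       # rounded cost 3 <= 2 * 1.5
```

The integer optimum on the triangle is \(2\), so the realized ratio here is \(3/2\), within the guaranteed bound of \(2\).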
The approximation ratio in this problem is \(\alpha = 2\).
Max Flow
cs.cmu.edu/~odonnell/toolkit13/lecture14.pdf
Cooperative Game Theory
Let \(\mathcal{N} = \{ 1,\dots,n \}\) be the set of players and \(v: 2^{\mathcal{N}} \mapsto \R^+\) be the worth function, which satisfies \(v(\emptyset) = 0\). A subset of \(\mathcal{N}\) is called a coalition. Let \(x \ge 0\) be the allocation vector, with \(x_i\) the payoff assigned to player \(i\). Let \(x(\emptyset) = 0\) and \(x(S) \triangleq \sum_{i \in S} x_i\) for a non-empty coalition \(S\). We say that coalition \(S\) improves upon \(x\) if \(v(S) > x(S)\).
Definition: The allocation vector \(x\) is said to be in the core if \[ \begin{gather} x(\mathcal{N}) = v(\mathcal{N}) \\ \forall S \subseteq \mathcal{N}, x(S) \ge v(S) \label{core-inequality} \end{gather} \] which means no \(S\) improves upon \(x\).
It is essential to understand the implication behind the game. This "cooperative game" is actually about preventing players from cooperating, or colluding, i.e., from forming a coalition. The allocation vector is part of this "evil scheme". The fact of the matter is, the solver of this problem is trying to pay off the players (in total no more than the worth he assigns to the grand coalition \(\mathcal{N}\)) so that every possible coalition contains some member who finds joining unattractive: since \[ x(S) = \sum_{i \in S} x_i \ge v(S) \] there must be some \(i \in S\) with \(x_i \ge v(S)/|S|\), i.e., some player already receives at least the equal share (equal share for everyone!) he would get from the coalition. \(x(S)\) is additive w.r.t. \(S\), representing the payoffs the players grab (snout in the trough) covertly and separately. The deterrent is even stronger when the \(\ge\) relation is strict.
The question is: for a given \((\mathcal{N}, v)\), is the core non-empty? Or rather, can such a scheme exist?
Example: Treasure hunt. \(n\) people discovered treasures in the mountain. 2 people are needed to bring one piece out and thus \(v(S) = \lfloor \frac{|S|}{2} \rfloor\). Is the core empty?
First consider the case when \(n\) is even. \(x = [1/2, \dots, 1/2]\) is in the core.
Now consider the case when \(n\) is odd (\(n \ge 3\)). The core is empty in this case: suppose \(x\) were in the core. For each player \(i\), the coalition \(\mathcal{N} \setminus \{i\}\) has an even number \(n-1\) of members, so \[ x(\mathcal{N}) - x_i = x(\mathcal{N} \setminus \{i\}) \ge v(\mathcal{N} \setminus \{i\}) = \frac{n-1}{2} = v(\mathcal{N}) = x(\mathcal{N}) \] forcing \(x_i \le 0\), hence \(x_i = 0\) for all \(i\). But then \(x(\{1,2\}) = 0 < 1 = v(\{1,2\})\), a contradiction.
Theorem: Bondareva-Shapley theorem. The cooperative game \((\mathcal{N}, v)\) has a non-empty core if and only if every set of numbers \(\{ y_S \}_{S \subseteq \mathcal{N}}\), indexed by \(\mathcal{N}\)'s subsets, such that
- \(\forall S \subseteq \mathcal{N}, y_S \ge 0\) and
- \(\forall i \in \mathcal{N}, \sum_{S: i \in S} y_S = 1\)
satisfies
- \(\sum_{S \subseteq \mathcal{N}} y_S v(S) \le v(\mathcal{N})\).
Interpretation:
\(y_S\) can be interpreted as the fraction of time each player in \(S\) spends on the coalition \(S\), justifying the non-negativity constraint. The total amount of time player \(i\) spends across coalitions is \(1\). \(y_S v(S)\) can then be understood as the proportional payoff from this partial commitment.
Proof:
Consider the following optimization problem: \[ \label{cgp} \tag{Coop. Game} \begin{aligned} \min \quad & [x(\mathcal{N}) \triangleq \sum_{i \in \mathcal{N}} x_i] \\ \text{s.t.} \quad & x(S) \ge v(S), \forall S \subseteq \mathcal{N} \end{aligned} \] Observe that the constraints are exactly the inequality conditions \(\eqref{core-inequality}\) for an allocation to be in the core. \(x \ge 0\) is automatically embedded in \(x(\{ i \}) \ge v(\{ i \}) \ge 0, i=1,\dots,n\). Moreover, the minimized objective together with the constraint \(x(\mathcal{N}) \ge v(\mathcal{N})\) implies \(\eqref{cgp}\) is lower-bounded by \(v(\mathcal{N})\). Also note that \(\eqref{cgp}\) is always feasible, because we can take every \(x_i\) large enough that each \(x(S)\) surpasses \(v(S)\). Being feasible and bounded below, \(\eqref{cgp}\) always has an optimal solution.
To rewrite \(\eqref{cgp}\) in LP form, let \[ e = [1,\dots,1]^T, A = \begin{bmatrix} \mathbb{1}_{\mathcal{N}}(\emptyset)^T \\ \vdots \\ \mathbb{1}_{\mathcal{N}}(S)^T \\ \vdots \\ \mathbb{1}_{\mathcal{N}}(\mathcal{N})^T \end{bmatrix}, b = \begin{bmatrix} v(\emptyset) \\ \vdots \\ v(S) \\ \vdots \\ v(\mathcal{N}) \end{bmatrix} \] where \(\mathbb{1}_\mathcal{N}(S)\) is the indicator function that returns the vector in \(\R^n\) whose \(i\)-th entry indicates whether \(i \in S\). Then \(\eqref{cgp}\) becomes \[ \label{cgp-lp} \tag{Coop. Game LP} \begin{aligned} v^* = \min \quad & e^T x \\ \text{s.t.} \quad & A x \ge b \end{aligned} \] We don't rush to convert it to standard LP form, which would require introducing a slack variable. Instead, we formulate the following LP problem, whose optimal value is the negative of \(\eqref{cgp-lp}\)'s: \[ \label{cgp-primal} \tag{Coop. Game P} \begin{aligned} v_p^* = \max \quad & -e^T x \\ \text{s.t.} \quad & -A x \le -b \end{aligned} \] That is, \(v^* = -v_p^*\). Recall that primal and dual are relative notions. The above problem has the dual \[ \label{cgp-dual} \tag{Coop. Game D} \begin{aligned} v_d^* = \min_{y \ge 0} \quad & -b^T y \\ \text{s.t.} \quad & A^T y = e \end{aligned} \] which is in standard LP form.
Note that the columns of \(A^T\) are exactly the indicator vectors, so \(A^T\)'s columns and \(y\)'s entries can both be indexed by coalitions \(S\), while the rows of \(A^T\) are indexed by players \(i\). The constraint \(A^T y = e\) says every entry of \(A^T y\) is \(1\); interpreting each entry as a row-column dot product, the constraint in \(\eqref{cgp-dual}\) reads exactly \[ \forall i \in \mathcal{N}, \sum_{S: i \in S} y_S = 1 \] Together with \(y \ge 0\), an optimal \(y^*\) satisfies the two conditions of the Bondareva-Shapley criterion. By strong duality, we have \[ v^* = x^*(\mathcal{N}) = \sum_{S} y_S^* v(S) = -v_d^* = -v_p^* \] Now, \(\forall S \subseteq \mathcal{N}, x^*(S) \ge v(S)\) by the constraints of \(\eqref{cgp-lp}\), and assuming the criterion's condition holds, \(x^*(\mathcal{N}) = \sum_{S} y_S^* v(S) \le v(\mathcal{N})\). Combined with \(x^*(\mathcal{N}) \ge v(\mathcal{N})\), we conclude that \(x^*(\mathcal{N}) = v(\mathcal{N})\): \(x^*\) is in the core.
Given its minimax nature, one can expect a lot of duality throughout game theory.
Solving LP
Algorithms for solving LP problems generally fall into two families: the simplex method and interior-point methods.
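For example, `scipy`'s `linprog` exposes both families through its HiGHS backends (`highs-ds` for dual simplex, `highs-ipm` for interior point); a minimal sketch on made-up data:

```python
import numpy as np
from scipy.optimize import linprog

# A tiny standard-form LP: min c^T x  s.t.  Ax = b, x >= 0
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([2.0, 3.0, 1.0])

simplex = linprog(c, A_eq=A, b_eq=b, bounds=(0, None), method="highs-ds")
ipm = linprog(c, A_eq=A, b_eq=b, bounds=(0, None), method="highs-ipm")
print(simplex.fun, ipm.fun)  # both methods report the same optimal value
```

The two methods travel very different paths (vertex to vertex vs. through the interior) but agree on the optimum.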