Coordinate Descent
Convex and Differentiable Function
Given a convex, differentiable \(f: \R^n \mapsto \R\), if we are at a point \(x\) such that \(f(x)\) is minimized along each coordinate axis, have we found a global minima? That is, does \[ \forall d, i,f(x + d \cdot e_i) \ge f(x) \Rightarrow f(x) = \min_zf(z), \text{ where $e_i$ is the $i$-th standard basis ?} \] The answer is yes. This is because \[ \nabla f(x) = \left[ \begin{array} \\ \frac{\partial f}{\partial x_1}, \cdots, \frac{\partial f}{\partial x_n} \end{array} \right] = 0 \]
Convex but Non-differentiable Function
If \(f\) is only convex but not differentiable, the above will not necessarily hold. However, if \(f\) can be decomposed such that \[ f(x) = g(x) + \sum_{i=1}^nh_i(x_i) \] where \(g(x)\) is convex and differentiable, and each \(h_i(x_i)\) is convex but possibly non-differentiable, the global minima still holds. For any \(y\), \[ \begin{aligned} f(y) - f(x) &= g(y) - g(x) + \sum_{i=1}^nh_i(y_i) - \sum_{i=1}^nh_i(x_i) \\ &\ge \nabla g(x)^T(y - x) + \sum_{i=1}^n[h_i(y_i) - h_i(x_i)] \\ &= \sum_{i=1}^n[\nabla_i g(x)(y_i - x_i) + h_i(y_i) - h_i(x_i)] \end{aligned} \] Because \(f(x)\) obtains minimum along each axes, for any \(d\) and \(k\), \[ \begin{aligned} f(x + d \cdot e_k) &\ge f(x) \\ g(x + d \cdot e_k) + \sum_{i=1,i\ne k}^nh_i(x_i) + h_k(x_k + d) &\ge g(x) + \sum_{i=1}^nh_i(x_i) \\ g_k(x_k + d) - g_k(x_k) + h_k(x_k + d) - h_k(x_k) &\ge 0 \\ \end{aligned} \] By the linearity of subgradient, \(0 \in \partial(g_k + h_k) = \nabla_k g + \partial h_k \Rightarrow -\nabla_k g \in \partial h_k\). That is \[ h_k(y_k) - h_k(x_k) \ge -\nabla_k g(x)(y_k - x_k) \] Then, \[ f(y) - f(x) = \sum_{i=1}^n[\nabla_i g(x)(y_i - x_i) + h_i(y_i) - h_i(x_i)] \ge 0 \]
<<<<<<< HEAD Coordinate Descent.pdf ======= ## External Materials >>>>>>> notes
Coordinate Descent.pdf || Coordinate Descent in One Line, or Three if Accelerated | A Butterfly Valley (wordpress.com)