Optimization

  • Lagrange Multiplier

    Standard Form The standard form of the constrained optimization problem addressed by Lagrange multipliers is \[ \begin{aligned} \inf \quad & f_0(x) \\ \text{s.t.} \quad & f_i(x) \le 0, \quad i = 1,\dots,r \\ & h_i(x) = 0, \quad i = 1,\dots,s \end{aligned} \] Denote the feasible set of \(x\) satisfying all the constraints of the above problem as \(\mathcal X \subseteq \R^n\).
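
    To see where the multipliers enter (a standard construction; the symbols \(\lambda\) and \(\nu\) are introduced here only for illustration), each inequality and equality constraint gets its own multiplier in the Lagrangian \[ L(x, \lambda, \nu) = f_0(x) + \sum_{i=1}^{r} \lambda_i f_i(x) + \sum_{i=1}^{s} \nu_i h_i(x), \quad \lambda_i \ge 0, \] and minimizing \(L\) over \(x\) yields the dual function \(g(\lambda, \nu) = \inf_x L(x, \lambda, \nu)\).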

  • Convex Optimization

    Definitions Convex Set A set \(\mathcal X\) is called a convex set if for any \(x^{(1)}, x^{(2)} \in \mathcal X\) and any \(\alpha \in [0,1]\), \(\alpha x^{(1)} + (1 - \alpha)x^{(2)} \in \mathcal X\).
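
    For a quick sanity check of the definition, a halfspace \(\mathcal X = \{x \in \R^n \mid a^T x \le b\}\) (with \(a, b\) purely illustrative) is convex: \[ a^T\bigl(\alpha x^{(1)} + (1 - \alpha)x^{(2)}\bigr) = \alpha\, a^T x^{(1)} + (1 - \alpha)\, a^T x^{(2)} \le \alpha b + (1 - \alpha) b = b. \]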

  • Gradient Descent

    If a convex function \(f\) is differentiable and satisfies certain “regularity conditions”, we get a nice guarantee that gradient descent will converge to the minimum of \(f\). \(L\)-smoothness Qualitatively, smoothness means that the gradient of \(f\) changes in a controlled, bounded manner.
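
    As a minimal sketch of the iteration (the quadratic objective and the \(1/L\) step size below are illustrative choices, not taken from the note):

    ```python
    import numpy as np

    def gradient_descent(grad, x0, step, n_iters=1000):
        """Plain gradient descent: x_{k+1} = x_k - step * grad(x_k)."""
        x = x0
        for _ in range(n_iters):
            x = x - step * grad(x)
        return x

    # Illustrative L-smooth convex objective: f(x) = 0.5 * x^T A x - b^T x.
    # Its gradient A x - b is Lipschitz with constant L = largest eigenvalue of A.
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    L = np.linalg.eigvalsh(A).max()

    x_hat = gradient_descent(lambda x: A @ x - b, np.zeros(2), step=1.0 / L)
    print(x_hat, np.linalg.solve(A, b))  # the two should agree closely
    ```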

  • Coordinate Descent

    Convex and Differentiable Function Given a convex, differentiable \(f: \R^n \mapsto \R\), if we are at a point \(x\) such that \(f(x)\) is minimized along each coordinate axis, have we found a global minimum?
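
    A minimal cyclic coordinate descent sketch, minimizing exactly along one axis at a time (the quadratic example is illustrative):

    ```python
    import numpy as np

    def coordinate_descent(A, b, n_sweeps=100):
        """Cyclic coordinate descent on f(x) = 0.5 * x^T A x - b^T x.
        Exact minimization along coordinate i sets the i-th partial derivative
        to zero: x_i = (b_i - sum_{j != i} A_ij x_j) / A_ii.
        """
        n = len(b)
        x = np.zeros(n)
        for _ in range(n_sweeps):
            for i in range(n):
                x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
        return x

    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    print(coordinate_descent(A, b), np.linalg.solve(A, b))  # should agree closely
    ```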

  • Expectation Maximization

    If a probabilistic model contains only observable variables, maximum likelihood estimation or Bayesian methods can be adopted to estimate the model parameters. However, it is also possible that a probabilistic model contains unobservable variables (called latent variables).

  • Subgradient

    Definition The set of subgradients (the subdifferential) of a function \(f: \R^n \mapsto \R\) at \(x\) is defined as \[ \partial f(x) = \{g \mid f(y) \ge f(x) + g^T(y - x), \forall y \in \operatorname{dom}(f) \} \]
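
    The absolute value is the classic example: \(f(x) = |x|\) is not differentiable at \(0\), yet \[ \partial f(x) = \begin{cases} \{\operatorname{sign}(x)\}, & x \neq 0 \\ [-1, 1], & x = 0, \end{cases} \] since at \(x = 0\) the inequality \(|y| \ge g\,y\) for all \(y\) holds exactly when \(g \in [-1, 1]\).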

  • Least Angle Regression

    Forward Selection In cases where we are to solve for \(W\) such that \(Y = XW\), where \(Y \in \R^M, X \in \R^{M \times N}, W \in \R^N\), we can do so in an iterative, greedy way.
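
    A greedy forward-selection sketch in that spirit (the correlation-based selection rule and the refit-by-least-squares step below are illustrative, not necessarily the note's exact procedure):

    ```python
    import numpy as np

    def forward_selection(X, Y, n_steps):
        """Greedy forward selection for Y ≈ X W: at each step, pick the column
        of X most correlated with the current residual, then refit least squares
        on the columns selected so far."""
        M, N = X.shape
        selected = []
        W = np.zeros(N)
        residual = Y.copy()
        for _ in range(min(n_steps, N)):
            correlations = np.abs(X.T @ residual)
            correlations[selected] = -np.inf      # never pick a column twice
            selected.append(int(np.argmax(correlations)))
            W_sel, *_ = np.linalg.lstsq(X[:, selected], Y, rcond=None)
            W = np.zeros(N)
            W[selected] = W_sel
            residual = Y - X @ W
        return W

    # Illustrative data: only columns 1 and 3 actually matter.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))
    Y = 2.0 * X[:, 1] - 3.0 * X[:, 3]
    print(np.round(forward_selection(X, Y, n_steps=2), 3))
    ```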

  • Convex Conjugate

    For a convex function \(f\), its convex conjugate \(f^*\) is defined as \[ f^*(t) = \sup_x \left[ x^T t - f(x) \right] \] By definition, a Convex Conjugate pair
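
    A standard worked example: for \(f(x) = \frac{1}{2}\|x\|_2^2\), the supremum of \(x^T t - \frac{1}{2}\|x\|_2^2\) over \(x\) is attained at \(x = t\), giving \[ f^*(t) = t^T t - \tfrac{1}{2}\|t\|_2^2 = \tfrac{1}{2}\|t\|_2^2, \] so this \(f\) is its own conjugate.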