Lipschitz Continuity

For a continuous mapping \(f\), it is \(K\)-Lipschitz continuous if there exists a number \(K\) such that \(\forall x,y \in \dom(f)\) \[ ||f(x) - f(y)|| \le K||x - y|| \] If the gradient of \(f\) is \(K\)-Lipschitz continuous, we further say \(f\) is \(K\)-smooth.

Lipschitz Constant

If \(K\) is the minimum number to make the above condition hold, then \(K\) is called the Lipschitz constant.

The Lipschitz constant for a general differentiable function \(f\) will be the maximum spectral norm of its gradient \(\nabla f\) over its domain. \[ ||f||_{Lip} = \sup_x \sigma[\nabla f(x)] = \sup_x \sup_{||v||=1} \nabla f(x) \cdot v \] where \(\sigma[\nabla f(x)]\) denotes the spectral norm of \(f\)’s gradient at \(x\).

The Lipschitz constant for a matrix transformation will be the matrix’s spectral norm.
The Lipschitz constant for a \(\R \mapsto \R\) function will be its largest subgradient over its domain

Composition of Functions

Suppose two functions \(f\) and \(g\) are Lipschitz continuous respectively. Then, \[ \nabla (f \circ g)(x) = \nabla f [g(x)] \nabla g(x) \] by the chain rule of derivatives. \(f \circ g\)’s Lipschitz constant will be \[ \begin{aligned} \sigma(\nabla (f \circ g)(x)) &= \sup_{||v|| = 1} ||\{\nabla f[g(x)] \nabla g(x)\} v|| \\ &= \sup_{||v|| = 1} \{||\nabla g(x) v||\} \{||\nabla f[g(x)] \frac{\nabla g(x) v}{||\nabla g(x) v||}||\} \\ &\le \sup_{||v|| = 1} \sigma[\nabla g(x)] \{||\nabla f[g(x)] \frac{\nabla g(x) v}{||\nabla g(x) v||}||\} \\ &= \sigma[\nabla g(x)] \sup_{||v|| = 1} \{||\nabla f[g(x)] \frac{\nabla g(x) v}{||\nabla g(x) v||}||\} \\ &= \sigma[\nabla g(x)] \cdot \sigma\{\nabla f[g(x)]\} \end{aligned} \] In other words, \(||f \circ g||_{Lip} = ||f||_{Lip} \cdot ||g||_{Lip}\)

Lipschitz Continuity, convexity, subgradients – Marco Tulio Ribeiro – (washington.edu)

Lipschitz continuous gradient · Xingyu Zhou’s blog

Last updated on Jan 31, 2022