Least Squares

Suppose we are solving the \(Ax = b\) problem. The vector \(b\) does not always lie in the column space of \(A\). However, we can look for an \(\hat x\) such that \(A\hat x\), which does lie in \(A\)’s column space, best approximates \(b\). By best approximation we mean minimizing \(||Ax - b||\) over all \(x \in \R^n\).

The best approximation is achieved when \(A\hat x = \hat b = \mathop{proj}_{Col(A)}b\).
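As a small numerical sketch of this projection (assuming NumPy, with an example matrix chosen for illustration), we can build an orthonormal basis for \(Col(A)\) from the reduced QR factorization and project \(b\) onto it:

```python
import numpy as np

# Example: A is 3x2 with independent columns, and b is not in Col(A).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Orthonormal basis for Col(A) via the reduced QR factorization.
Q, R = np.linalg.qr(A)

# Project b onto Col(A): b_hat = Q Q^T b.
b_hat = Q @ Q.T @ b

# The residual b - b_hat is orthogonal to every column of A.
print(np.allclose(A.T @ (b - b_hat), 0))  # True
```

For this particular \(A\) and \(b\), \(\hat b = (5, 2, -1)\), and the residual \(b - \hat b = (1, -2, 1)\) is indeed orthogonal to both columns of \(A\).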

Instead of finding an orthogonal basis for \(Col(A)\), computing \(\hat b\), and then solving \(Ax = \hat b\), we can derive \(\hat x\) directly: \[ \begin{gather} (b - \hat b) \perp Col(A) \iff (b - \hat b) \in Nul(A^T)\iff A^T(b - \hat b) = 0 \\ A^T(b - Ax) = 0 \\ A^TAx = A^Tb \label{solution} \end{gather} \]
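The normal equations \(A^TAx = A^Tb\) can be solved with any linear solver. A minimal sketch (same illustrative \(A\) and \(b\) as above, assuming NumPy), cross-checked against NumPy's built-in least-squares routine:

```python
import numpy as np

# Overdetermined system with independent columns (an assumed example).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Normal equations: solve A^T A x = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Cross-check against NumPy's least-squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_hat)                      # [ 5. -3.]
print(np.allclose(x_hat, x_ref))  # True
```

Note that in practice solvers like `lstsq` avoid forming \(A^TA\) explicitly (it squares the condition number); the normal equations are used here to mirror the derivation.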

We will show that if the columns of \(A\) are independent, then the least-squares solution \(\hat x\) is uniquely given by \(\hat x = (A^TA)^{-1}A^Tb\).

First, \(Nul(A) = Nul(A^TA)\). This is because \[ \begin{gather} Ax = 0 \Rightarrow A^TAx = A^T0 = 0 \\ A^TAx = 0 \Rightarrow x^TA^TAx = 0 \iff (Ax)^TAx = ||Ax||^2 = 0 \iff Ax = 0 \end{gather} \] When the columns of \(A\) are independent, \(Nul(A) = \{0\}\), so \(Nul(A^TA) = \{0\}\); since \(A^TA\) is square, it is invertible, and equation \(\eqref{solution}\) has a unique solution. Conversely, as an aside, when \(A^TA\) is invertible, \(Nul(A^TA) = \{0\}\), so \(Nul(A) = \{0\}\), which means the columns of \(A\) are independent.
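A quick numerical check of both claims (a sketch assuming NumPy; the random matrix has independent columns with probability 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random tall matrix: its columns are independent almost surely.
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

# rank(A) == rank(A^T A) means the two null spaces have equal dimension;
# combined with Nul(A) being contained in Nul(A^T A), they coincide.
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T @ A))  # True

# With independent columns, A^T A is invertible and
# x_hat = (A^T A)^{-1} A^T b is the unique least-squares solution.
x_hat = np.linalg.inv(A.T @ A) @ (A.T @ b)
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_ref))  # True
```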
