Eigenvectors and Eigenvalues

An eigenvector of an \(n \times n\) matrix \(A\) is a nonzero vector \(\mathrm x\) such that \(A\mathrm x = \lambda \mathrm x\) for some scalar \(\lambda\). This scalar \(\lambda\) is the corresponding eigenvalue. Note that \(A \vec 0 = \lambda \vec 0\) holds for every scalar \(\lambda\), so \((\lambda, \vec 0)\) would trivially satisfy the equation for any square matrix; \(\vec 0\) is too trivial to be informative, which is why it is excluded from the definition of an eigenvector.

Note, though, that \(0\) can be an eigenvalue. Also note that if \((\lambda, v)\) is an eigenpair of a matrix \(A\), then \((\lambda, kv)\) is also an eigenpair for any nonzero scalar \(k\): eigenvectors are determined only up to scale.
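
A minimal numpy check of the definition and the scaling property (the toy matrix below is an assumed example, not from the notes):

```python
import numpy as np

# A toy matrix whose eigenpairs can be read off the diagonal.
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)    # columns of eigvecs are eigenvectors
lam, v = eigvals[0], eigvecs[:, 0]

# A v = lambda v holds for the eigenvector v ...
assert np.allclose(A @ v, lam * v)
# ... and for any nonzero scalar multiple k v, as noted above.
k = -4.2
assert np.allclose(A @ (k * v), lam * (k * v))
```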

Similarity

If \(A\) and \(B\) are \(n \times n\) matrices, then \(A\) is similar to \(B\) if there is an invertible matrix \(P\) such that \(PAP^{-1} = B\), or equivalently \(P^{-1}BP = A\).

If \(A\) and \(B\) are similar, then they have the same characteristic polynomial and hence the same eigenvalues: \[ B - \lambda I = PAP^{-1} - \lambda PP^{-1} = P(A - \lambda I)P^{-1} \]

\[ \begin{aligned} &\det(B - \lambda I) = \det(P(A - \lambda I)P^{-1}) \\ &= \det(P) \cdot \det(A - \lambda I) \cdot \det(P^{-1}) \\ &= \det(A - \lambda I) \end{aligned} \]
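
To illustrate numerically, the sketch below (with assumed random matrices) builds \(B = PAP^{-1}\) and checks that \(A\) and \(B\) share the same characteristic polynomial and eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))      # a random matrix is generically invertible
B = P @ A @ np.linalg.inv(P)         # B = P A P^{-1}, so B is similar to A

# Same characteristic polynomial (np.poly returns its coefficients) ...
assert np.allclose(np.poly(A), np.poly(B))
# ... hence the same eigenvalues, up to floating-point error.
assert np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(B)))
```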

Independence between Eigenvectors

  • Theorem

    Suppose \(v_1, v_2, ... , v_r\) are eigenvectors corresponding to distinct eigenvalues \(\lambda_1, \lambda_2, ..., \lambda_r\) of an \(n \times n\) matrix \(A\) (\(v_1, v_2, ... , v_r\) are then said to come from different eigenspaces). Then \(v_1, v_2, ..., v_r\) are linearly independent.

  • Proof

    Suppose instead that these vectors are linearly dependent. Let \(p\) be the least index such that \(v_{p+1}\) is a linear combination of the preceding vectors. Then there exist scalars \(c_1, c_2, ..., c_p\) such that \[ \begin{equation} c_1v_1 + c_2v_2 + \cdots + c_pv_p = v_{p+1} \label{lincom} \end{equation} \] Multiply both sides by \(A\) to obtain \[ \begin{equation} c_1\lambda_1v_1 + c_2\lambda_2v_2 + \cdots + c_p\lambda_pv_p = \lambda_{p+1}v_{p+1} \label{eq1} \end{equation} \] Multiply both sides of \(\eqref{lincom}\) by \(\lambda_{p+1}\) to give \[ \begin{equation} c_1\lambda_{p+1}v_1 + c_2\lambda_{p+1}v_2 + \cdots + c_p\lambda_{p+1}v_p = \lambda_{p+1}v_{p+1} \label{eq2} \end{equation} \] Subtract \(\eqref{eq2}\) from \(\eqref{eq1}\) to give \[ \begin{equation} c_1(\lambda_1 - \lambda_{p+1})v_1 + c_2(\lambda_2 - \lambda_{p+1})v_2 + \cdots + c_p(\lambda_p - \lambda_{p+1})v_p = 0 \label{diff} \end{equation} \] Since \(v_1, v_2, ..., v_p\) are linearly independent, the weights in \(\eqref{diff}\) are all zero. None of the factors \((\lambda_i - \lambda_{p+1})\) is zero because the eigenvalues are distinct, so \(c_i = 0\) for all \(i = 1, ..., p\). Then \(\eqref{lincom}\) says \(v_{p+1} = 0\), which is impossible because eigenvectors are nonzero.

  • Note

    Since \(v_1\) is an eigenvector, it is nonzero, so \(\{v_1\}\) is linearly independent by itself; this is what lets the argument go through even when \(p = 1\). A quick numerical check of the theorem follows this list.
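
Below is a small numerical check of the theorem. A random symmetric matrix is assumed so that all \(n\) eigenvalues are real and, almost surely, distinct:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                        # symmetric; eigenvalues a.s. distinct

eigvals, eigvecs = np.linalg.eigh(A)     # columns of eigvecs are eigenvectors
assert len(np.unique(np.round(eigvals, 8))) == 5   # eigenvalues are distinct

# Eigenvectors from distinct eigenvalues are linearly independent:
# stacking them as columns gives a full-rank matrix.
assert np.linalg.matrix_rank(eigvecs) == 5
```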

Diagonalization

  • Theorem

    An \(n \times n\) matrix \(A\) is diagonalizable if and only if \(A\) has \(n\) linearly independent eigenvectors.

  • Proof

    • Necessity

      Suppose \(A = PDP^{-1}\); then \(AP = PD\). Let \(\newcommand{\p}{\mathrm{p}} P = [\p_1, \p_2, ..., \p_n]\). Since \(D\) is a diagonal matrix, \[ AP = [A\p_1, A\p_2, ..., A\p_n], \qquad PD = [D_{11} \p_1, D_{22} \p_2, ..., D_{nn} \p_n] \] so \(A\p_i = D_{ii}\p_i\) for each \(i\). Because \(P\) is invertible, the \(\p_i\)'s are nonzero and linearly independent, so they are \(n\) linearly independent eigenvectors of \(A\), with eigenvalues \(D_{ii}\).

      From this we can also see that if \(A\) is diagonalizable, then the columns of \(P\) must be eigenvectors of \(A\), and \(D\) must be the diagonal matrix filled with the corresponding eigenvalues, in the same order.

    • Sufficiency

      Conversely, suppose \(A\) has \(n\) linearly independent eigenvectors \(v_1, ..., v_n\) with eigenvalues \(\lambda_1, ..., \lambda_n\). Let \(P = [v_1, ..., v_n]\) and \(D = \operatorname{diag}(\lambda_1, ..., \lambda_n)\). Then \(AP = PD\) column by column, and \(P\) is invertible because its columns are linearly independent, so \(A = PDP^{-1}\). A numerical sketch follows.
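
As a numerical sketch of the theorem (assuming a random real matrix, which is almost surely diagonalizable, possibly over the complex numbers):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))      # a.s. diagonalizable, maybe over C

# np.linalg.eig returns D's diagonal and P's columns directly, matching
# the proof: P's columns are eigenvectors, D holds the eigenvalues.
eigvals, P = np.linalg.eig(A)
D = np.diag(eigvals)

assert np.allclose(A, P @ D @ np.linalg.inv(P))
```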

Not all matrices are diagonalizable. For example, the matrix \(\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\) is not diagonalizable: its only eigenvalue is \(1\), with algebraic multiplicity \(2\), but the corresponding eigenspace is the one-dimensional span of \((1, 0)^T\), so there is no set of \(2\) linearly independent eigenvectors.
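
The failure is also visible numerically: np.linalg.eig reports the eigenvalue \(1\) twice but returns two (nearly) parallel eigenvectors, so no basis of eigenvectors exists. A small check, with the caveat that the rank computation depends on floating-point tolerances:

```python
import numpy as np

J = np.array([[1.0, 1.0],
              [0.0, 1.0]])

eigvals, eigvecs = np.linalg.eig(J)
print(eigvals)                          # [1. 1.]: eigenvalue 1, multiplicity 2
print(eigvecs)                          # the two columns are (nearly) parallel
print(np.linalg.matrix_rank(eigvecs))   # expect 1: one independent eigenvector
```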

The diagonalization of a square matrix is also referred to as eigen decomposition.

Eigenvalues: Rank, Trace and Determinant

Since the characteristic equation of an \(n \times n\) matrix is a polynomial of degree \(n\), it always has exactly \(n\) roots, counting multiplicities and including complex roots. The eigenvalues relate to a matrix's rank, trace and determinant as follows: \[ \begin{gather} \operatorname{rank} = \text{the number of nonzero eigenvalues, counting multiplicities (for diagonalizable matrices)} \\ \operatorname{tr} = \text{the sum of the eigenvalues} \\ \det = \text{the product of the eigenvalues} \end{gather} \] Note that the rank identity can fail for non-diagonalizable matrices: \(\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\) has rank \(1\) but no nonzero eigenvalues.
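
A quick numpy check of the trace and determinant identities, on an assumed random test matrix (complex eigenvalues of a real matrix come in conjugate pairs, so the imaginary parts cancel):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
eigvals = np.linalg.eigvals(A)

# trace = sum of eigenvalues; det = product of eigenvalues.
assert np.isclose(np.trace(A), eigvals.sum().real)
assert np.isclose(np.linalg.det(A), eigvals.prod().real)
```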

Power Iteration

We can obtain the eigenvalues of an \(n \times n\) diagonalizable matrix \(A\) by solving \(\det (A - \lambda I) = 0\) and then solving for the corresponding eigenvectors. This approach costs \(O(n^3)\) time.

Often, though, we do not want all the eigenpairs, only those with the largest eigenvalues, e.g., when computing the best rank-\(r\) approximation with the SVD. Solving for all the eigenpairs is overkill there. Luckily, there is a lightweight iterative method that helps.

Begin with an arbitrary vector \(x_0 = x \in \R^n\). The iteration rule is \[ x_{t+1} = \frac{A x_t}{||A x_t||} \] Unrolling the recursion gives \(x_t = \frac{A^t x}{||A^t x||}\). Because \(A\) is diagonalizable, \(x\) can be written as a linear combination of \(A\)'s \(n\) normalized independent eigenvectors: \[ x = \sum_{i=1}^n \alpha_i v_i \] Let \(v_1, \dots, v_n\) be arranged so that the corresponding eigenvalues decrease in magnitude. Then, \[ \begin{aligned} A^t x &= A^t \sum_{i=1}^n \alpha_i v_i \\ &= \sum_{i=1}^n \alpha_i \lambda_i^t v_i \\ &= \alpha_1 \lambda_1^t \sum_{i=1}^n \frac{\alpha_i}{\alpha_1} \left(\frac{\lambda_i}{\lambda_1}\right)^t v_i \end{aligned} \] Assume that \(|\lambda_1|\) is strictly larger than the magnitude of every other eigenvalue and that \(\alpha_1 \neq 0\). Then \((\frac{\lambda_i}{\lambda_1})^t \to 0\) for \(i > 1\), so the direction of \(A^t x\) converges to that of \(\alpha_1 \lambda_1^t v_1\). Thus, \[ x_t = \frac{A^t x}{||A^t x||} \to v_1 \] up to sign. To find the corresponding eigenvalue, observe that the Rayleigh quotient satisfies \[ x_t^T A x_t \to \lambda_1 \] This gives the eigenpair with the largest (in magnitude) eigenvalue. To find the next one, repeat the above with \[ A' = A - \lambda_1 v_1 v_1^T \] where \(v_1\) is unit-norm; for a symmetric \(A\), this sets the eigenvalue \(\lambda_1\) to \(0\) while leaving all other eigenpairs unchanged. This process is called deflation for the power method. Refer to Finding Eigenvalues.pdf for the proof of correctness. This might also be related to the Courant–Fischer theorem.
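
Below is a minimal sketch of power iteration with deflation. It assumes a symmetric \(A\) (so the deflation step above is valid) and a strictly dominant eigenvalue; the function names, iteration count, and seeds are illustrative, not from the notes:

```python
import numpy as np

def power_iteration(A, num_iters=1000, seed=0):
    """Approximate the dominant eigenpair (lambda_1, v_1) of A.

    Assumes A is diagonalizable, |lambda_1| is strictly largest, and the
    random start vector has a nonzero component along v_1 -- the same
    assumptions as in the derivation above.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    for _ in range(num_iters):
        x = A @ x
        x /= np.linalg.norm(x)           # keep the iterate unit-norm
    lam = x @ A @ x                      # Rayleigh quotient x^T A x -> lambda_1
    return lam, x

def top_r_eigenpairs(A, r):
    """Peel off the top-r eigenpairs by magnitude via deflation.

    The step A' = A - lambda v v^T zeroes out the found eigenvalue while
    preserving the remaining eigenpairs, which is valid for symmetric A
    and unit-norm v.
    """
    A = A.astype(float)                  # work on a copy
    pairs = []
    for _ in range(r):
        lam, v = power_iteration(A)
        pairs.append((lam, v))
        A -= lam * np.outer(v, v)        # deflation
    return pairs

# Usage: a random symmetric matrix; residuals should be near zero.
rng = np.random.default_rng(42)
M = rng.standard_normal((6, 6))
S = (M + M.T) / 2
for lam, v in top_r_eigenpairs(S, 2):
    print(lam, np.linalg.norm(S @ v - lam * v))
```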

External Materials

Eigen Decomposition.pdf
