Determinant
Derivation of Determinant
Determinant may be the most infamous concept in linear algebra, in terms of its odd definition and computation. Sometimes, we may wonder why there has to be a determinant.
So instead of giving its definition directly, we first lay down some properties we expect the determinant to have. Then we try to construct the determinant from the ground up and prove its existence and uniqueness.
Determinant in essence captures the volume of the parallelepiped formed by vectors of an \(n \times n\) matrix. When \(n=1\), it is the length; when \(n=2\), it is the area of parallelogram; when \(n=3\), it is the volume of parallelepiped. What about when \(n\) goes larger?
As said, there are some basic properties that we expect the determinant (the volume) to have, and that indeed hold for cases \(n = 1,2,3\). In the following, let \(A\) be an \(n \times n\) matrix and denote \(\det(\v_1, \dots, \v_n)\) as the determinant of the matrix formed by a system of vectors \(\v_1, \dots, \v_n\).
Basic properties
Linearity in each argument: \[ \det(\v_1, \dots, \underbrace{\alpha \v_k + \beta \u_k}_k, \dots, \v_n) = \alpha \det(\v_1, \dots, \underset{k}{\v_k}, \dots, \v_n) + \beta \det(\v_1, \dots, \underset{k}{\u_k}, \dots, \v_n) \]
Anti-symmetry: \[ \det(\v_1, \dots, \underset{j}{\v_j}, \dots, \underset{k}{\v_k}, \dots, \v_n) = -\det(\v_1, \dots, \underset{j}{\v_k}, \dots, \underset{k}{\v_j}, \dots, \v_n) \]
Normalization: \[ \det(I) = 1 \]
With these basic properties, we can derive some advanced properties for determinant.
Derived properties
- Preservation under column replacement: \[ \det(\v_1, \dots, \underbrace{\v_j + \alpha \v_k}_{j}, \dots, \underset{k}{\v_k}, \dots, \v_n) = \det(\v_1, \dots, \underset{j}{\v_j}, \dots, \underset{k}{\v_k}, \dots, \v_n) \]
Zero determinants
If \(A\) has a zero column, then \(\det A = 0\).
If \(A\) has two equal columns, then \(\det A = 0\).
If one column of \(A\) is the multiple of another, then \(\det A = 0\).
If columns of \(A\) are linearly dependent, then \(\det A = 0\).
Diagonal matrices and triangular matrices
- Determinant of a diagonal matrix equal the product of the diagonal entries.
- Determinant of a triangular matrix equal the product of the diagonal entries.
Transpose and product
\(\det A^T = \det A\).
This property implies that, all the statements above about columns, can be applied to rows. Thus, to compute the determinant of a matrix, we can apply row operations to transform it to reduced row echelon form first, and then obtain the result by computing the product of the diagonal entries.
\(\det (AB) = (\det A)(\det B)\).
Construction
Now with these properties on hand, how can we find the definition of the determinant and how can we know that the definition is unique over these properties?
Consider an \(n \times n\) matrix \(A = \{ a_{jk} \}_{j,k=1}^n\). Let \(\v_1, \dots, \v_n\) be its columns and \(\newcommand{\e}{\mathrm{e}} \e_1, \dots, \e_n\) be the unit vectors. We have \[ \v_k = a_{1, k} \e_1 + a_{2, k} \e_2 + \dots + a_{n, k} \e_{n} = \sum_{j=1}^n a_{j, k} \e_j \] By the linearity in each argument, expand the first column to give \[ \det(\v_1, \dots, \v_n) = \det(\sum_{j=1}^n a_{j, 1} \e_i, \v_2, \dots, \v_n) = \sum_{j=1}^n a_{j, 1} \det(\e_j, \v_2, \dots, \v_n) \] Further expand the second column to give \[ \begin{aligned} &\det(\v_1, \dots, \v_n) = \sum_{j_1=1}^n a_{j_1, 1} \det(\e_{j_1}, \v_2, \dots, \v_n) \\ &= \sum_{j_1=1}^n a_{j_1, 1} \det(\e_{j_1}, \sum_{j_2=1}^n a_{j_2, 2} e_{j_2}, \v_3, \dots, \v_n) \\ &= \sum_{j_1=1}^n a_{j_1, 1} \sum_{j_2=1}^n a_{j_2, 2} \det(\e_{j_1}, e_{j_2}, \v_3, \dots, \v_n) \\ &= \sum_{j_1=1}^n \sum_{j_2=1}^n a_{j_1, 1} a_{j_2, 2} \det(\e_{j_1}, e_{j_2}, \v_3, \dots, \v_n) \end{aligned} \] Expand the remaining columns to give \[ \det(\v_1, \dots, \v_n) = \sum_{j_1=1}^n \sum_{j_2=1}^n \dots \sum_{j_n=1}^n a_{j_1, 1} a_{j_2, 2} \dots a_{j_n, n} \det(\e_{j_1}, e_{j_2}, \dots, \e_{j_n}) \] This yields \(n^n\) terms! But luckily, many terms are zero, as long as any two of \(j_1, \dots, j_n\) coincides. To eliminate zero terms, consider the permutation of \(\{ 1, \dots, n \}\). If \(j_1, \dots, j_n\) are chosen to be a permutation, \(\det(\e_{j_1}, e_{j_2}, \dots, \e_{j_n})\) is nonzero. We may use a function \(\sigma: \{ 1, \dots, n \} \to \{ 1, \dots, n \}\) to denote a permutation. Let the set of all permutations of set \(\{ 1, \dots n \}\) be \(\mathrm{Perm}(n)\). Then, \[ \det(\v_1, \dots, \v_n) = \sum_{\sigma \in \mathrm{Perm}(n)} a_{\sigma(1), 1} a_{\sigma(2), 2} \dots a_{\sigma(n), n} \det(e_{\sigma(1)}, e_{\sigma(2)}, \dots, e_{\sigma(n)}) \] The matrix with columns \(e_{\sigma(1)}, e_{\sigma(2)}, \dots, e_{\sigma(n)}\) can be obtained from identity matrix by finitely many column exchanges. So \(\det(e_{\sigma(1)}, e_{\sigma(2)}, \dots, e_{\sigma(n)})\) is \(1\) or \(-1\) depending on the number of column exchanges. We informally define the sign of function \(\sigma\) to be \(1\) if an even number of column exchanges are needed to permute \(1, \dots, n\) to \(\sigma(1), \dots, \sigma(n)\); and the signa of \(\sigma\) is \(-1\) if the number of exchanges is odd.
The necessary condition of the definition of determinant requires us define it like \[ \det A \coloneq \sum_{\sigma \in \mathrm{Perm}(n)} a_{\sigma(1), 1} a_{\sigma(2), 2} \dots a_{\sigma(n), n} \mathrm{sign}(\sigma) \] If we define it in this way, we can verify that it indeed satisfies the basic properties, concluding the construction of determinant.
Cofactor Expansion
For an \(n \times n\) matrix \(A\), let \(A_{j,k}\) denote \((n-1) \times (n-1)\) matrix obtained by crossing out the row \(j\) and column \(k\) of matrix \(A\). The determinant of \(A\) can be expanded in the row number \(k\) as \[ \det A = a_{j, 1} (-1)^{j+1} \det A_{j, 1} + a_{j, 2} (-1)^{j+2} \det A_{j, 2} + \dots + a_{j, n} (-1)^{j+n} \det A_{j, n} \] The numbers \(C_{j,k} = (-1)^{j+k} \det A_{j,k}\) are called the cofactors of matrix \(A\). The matrix \(C = \{ C_{j,k} \}_{j=1,k=1}^n\) is called the cofactor matrix of \(A\). Suppose \(A\) is an invertible matrix, then \[ A^{-1} = \frac{1}{\det A} C^T \] \(C^T\) is sometimes denoted as \(A^*\), called the adjugate matrix of \(A\).
Cramer’s Rule
For an invertible matrix \(A\) and an equation \(Ax = b\), there is a unique solution that \[ x = A^{-1} b = \frac{1}{\det A} C^T b \] \(x_k\) is obtained by multiplying the \(k\)-th column of the cofactor matrix with \(b\), which is equivalent to the determinant of the matrix obtained by replacing \(k\)-th column of \(A\) with the vector \(b\). This property is known as Cramer’s rule:
For an invertible matrix \(A\), the entry \(k\) of the solution of the equation \(Ax = b\) is given by the formula \[ x_k = \frac{\det B_k}{\det A} \] where \(B_k\) is obtained by replacing \(k\)-th column of \(A\) with the vector \(b\).
Algebraic Properties
In blockwise matrix multiplication, we have \[ \begin{aligned} \det \begin{pmatrix} A & 0 \\ C & D \end{pmatrix} &= \det \left( \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix} \begin{pmatrix} I & B \\ C & I \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & D \end{pmatrix} \right) \\ \\ &= \det(A) \det(D) \\ \\ \det \begin{pmatrix} A & B \\ 0 & D \end{pmatrix} &= \det \left( \begin{pmatrix} I & 0 \\ 0 & D \end{pmatrix} \begin{pmatrix} I & B \\ 0 & I \end{pmatrix} \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix} \right) \end{aligned} \] If \(A\) is invertible, \[ \begin{aligned} \det \begin{pmatrix} A & B \\ C & D \end{pmatrix} &= \det \left( \begin{pmatrix} I & 0 \\ C A^{-1} & I \end{pmatrix} \begin{pmatrix} A & 0 \\ 0 & D - C A^{-1} B \end{pmatrix} \begin{pmatrix} I & A^{-1} B \\ 0 & I \end{pmatrix} \right) \\ &= 1 \cdot \det A \det (D - C A^{-1} B) \cdot 1 \\ &= \det A \det (D - C A^{-1} B) \end{aligned} \]