Where does the definition of the adjugate matrix come from?


When inverting a matrix, we can take the inverse of $A$ to be the reciprocal of the determinant of $A$ multiplied by the adjugate matrix of $A$. Or rather, the product of the adjugate matrix of $A$ with $A$ is defined to be equal to $\det(A)\,I$.

This is considered a "definition", but as yet I cannot find any explanation of why this should be the case, why it works, or even who invented the concept of an adjugate matrix.

Clearly this "definition" must be "correct", in the sense that I could "define" the product of the "adjugate" of $A$ with $A$ to be anything (say, $3$), yet such a "definition" would clearly be "wrong".

Looking in more detail, we see that the adjugate matrix of A is the transpose of the cofactor matrix of A. Why should this be the case? I cannot find any information on why cofactors enable matrix inversion. Again this is given as a "definition":

$\mathbf A^{-1} = \frac{1}{\operatorname{det}(\mathbf A)} \mathbf C^\mathsf{T}.$

And hence apparently just assumed to be true, and I can find no information on why this works.
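For concreteness, the identity can at least be checked numerically. Below is a minimal Python/NumPy sketch; the matrix `A` and the helper `cofactor_matrix` are illustrative choices made up for this check, not anything from the question:

```python
import numpy as np

def cofactor_matrix(A):
    """Cofactor matrix C: C[i, j] = (-1)**(i+j) * det of A with row i, col j deleted."""
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
adjA = cofactor_matrix(A).T  # adjugate = transpose of the cofactor matrix
print(np.allclose(adjA @ A, np.linalg.det(A) * np.eye(3)))          # True
print(np.allclose(np.linalg.inv(A), adjA / np.linalg.det(A)))       # True
```

This only confirms *that* the formula holds for one example, of course, not *why*, which is the question below.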

My core questions are really:

  • What actually is an adjugate matrix?
  • How does it allow for matrix inversion? There must be something special about it.
  • Or did somebody just figure out having a matrix with determinants on the diagonal was useful? If so - how did they figure this out? What is the logical thought-process to reach this conclusion?
  • How does the machinery of cofactors play into this? Why is the adjugate the transpose of the matrix cofactors? How does this help?

Or to summarise: why does this all "magically" seem to work, anyway?



One way to think about this is using exterior powers. If $A : V \to V$ and $V$ is $d$-dimensional, then you get an induced map $C = \wedge^{d-1} A : \wedge^{d-1} V \to \wedge^{d-1} V$. If you fix an identification of $\wedge^d V \to k$, the base field, then $\wedge : V \otimes\wedge^{d-1} V \to \wedge^d V \to k$ is a perfect pairing, and if you correspondingly choose dual bases then the matrix of $\wedge^{d-1} A$ is the adjugate of the matrix of $A$. The formula $C^T A = (\det A) I$ is a consequence of the above pairing and the identification $\wedge^d A = \det A$ induced by $\wedge^d V \cong k$.

A bit more explicitly, if $V$ has basis $e_1, \dots, e_d$ then $\wedge^k V$ has basis $e_I = e_{i_1} \wedge \dots \wedge e_{i_k}$ for $I = (i_1, \dots, i_k)$ with $1 \le i_1 < \dots < i_k \le d$. The induced map $\wedge^k A$ takes $e_I$ to $\sum a_{I, J} e_J$ where $a_{I, J}$ is the determinant of the submatrix with columns indexed by $I$ and rows indexed by $J$. In particular, $\wedge^d V$ is $1$-dimensional, generated by $e_1 \wedge \dots \wedge e_d$ and $\wedge^d A$ is given by (multiplication by) $\det A$. To understand the adjugate, note that $\wedge^{d-1} V$ has basis $$e_1 \wedge \dots \wedge \hat{e_j} \wedge \dots \wedge e_d,$$ where $\hat{e_j}$ signifies that that factor is omitted. Up to some signs this is the dual basis of $e_i$ as mentioned above, explicitly $$(e_i) \wedge (e_1 \wedge \dots \wedge \hat{e_j} \wedge \dots \wedge e_d) = (-1)^{i-1} \delta_{i, j} \, (e_1 \wedge \dots \wedge e_d) .$$ Once these signs are incorporated, the general description of the matrix for $\wedge^k A$ specializes to the adjugate matrix when $k = d - 1$.
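Under the assumptions of this answer, the description can be tested numerically. The sketch below builds the matrix of $\wedge^{d-1} A$ out of $(d-1)\times(d-1)$ minors and recovers the adjugate once the signs and the reversed basis order are accounted for; `wedge_power_matrix` and the matrix `A` are names and data invented for this illustration:

```python
import itertools
import numpy as np

def wedge_power_matrix(A, k):
    """Matrix of the map induced by A on the k-th exterior power:
    entry (J, I) is the k x k minor of A with rows J and columns I."""
    n = A.shape[0]
    subsets = list(itertools.combinations(range(n), k))
    W = np.empty((len(subsets), len(subsets)))
    for r, J in enumerate(subsets):
        for c, I in enumerate(subsets):
            W[r, c] = np.linalg.det(A[np.ix_(J, I)])
    return W

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
n = A.shape[0]
W = wedge_power_matrix(A, n - 1)

# In lexicographic order the subset omitting index j sits at position n-1-j,
# and the pairing with e_i contributes a sign (-1)**(i+j); undoing both
# turns the matrix of the (n-1)-st exterior power into the adjugate.
adj = np.empty((n, n))
for i in range(n):
    for j in range(n):
        adj[i, j] = (-1) ** (i + j) * W[n - 1 - j, n - 1 - i]

print(np.allclose(adj @ A, np.linalg.det(A) * np.eye(n)))  # True
```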

This is surely not how anyone came up with either the idea or the explicit formula for the adjugate matrix but for me it explains the "magic trick".


I wish I could fully understand Ronno’s answer; one day. A more concrete, if less fantastic, explanation is this (and one more plausible for people to have found back in the day):

I too struggled to see why this works, and why $(\det A)^{-1}C^t$ should be the inverse, until I came up with this observation. In trying to find an inverse, we are basically trying to build a matrix product that acts like a Kronecker $\delta$: we need a way of ‘detecting’ when the product pairs the $j$th row with the $j$th column, versus the $i$th row with the $j$th column for $j\neq i$. The idea is to use determinants. Specifically, we want the products for the diagonal entries to compute the determinant of a known matrix, and all other products to compute the determinant of a singular matrix, which gives zero.

The cofactor expansion is a marvellous formula, and allows this to work very elegantly. Consider what happens when you calculate a diagonal entry in $AC^t$, say at position $(j,j)$. You have: $$\begin{align}\sum_{k=1}^n (A)_{j,k}(C^t)_{k,j}&=\sum_{k=1}^n a_{j,k}(C)_{j,k}\\&=\sum_{k=1}^n(-1)^{j+k}a_{j,k}\det M_{j,k}\\&=\det A\end{align}$$

where $M_{j,k}$ is the $(n-1)$-square minor matrix, by the cofactor expansion formula and the definition of $C$. Great! What happens when you compute a nondiagonal entry of $AC^t$ is a little harder to visualise.
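This diagonal computation is easy to check numerically. A small sketch (the `cofactor` helper and the matrix `A` are illustrative choices, not part of the answer):

```python
import numpy as np

def cofactor(A, j, k):
    """(j, k) cofactor: signed determinant of A with row j and column k deleted."""
    minor = np.delete(np.delete(A, j, axis=0), k, axis=1)
    return (-1) ** (j + k) * np.linalg.det(minor)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
n = A.shape[0]
# Each diagonal entry of A C^T is the cofactor expansion of det A along row j.
diag = [sum(A[j, k] * cofactor(A, j, k) for k in range(n)) for j in range(n)]
print(np.allclose(diag, np.linalg.det(A)))  # True
```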

Recall that the $(i,j)$th minor matrix is visually obtained by blotting out row $i$ and column $j$. It therefore contains no information about row $i$ or column $j$, so you can change the entries in that row or column without changing the minor. That is, if you alter the values of ‘$a$’ in: $$\sum_{k=1}^n(-1)^{k+j}a_{j,k}\det M_{j,k}$$ you will compute the determinant of the matrix obtained by changing the entries of the $j$th row, for the minors along the $j$th row of this new matrix are exactly the same, and the sum above is just the cofactor expansion of this new matrix along its $j$th row.

In particular, calculate the $(i,j)$ entry ($i\neq j$) of $AC^t$: $$\sum_{k=1}^n(-1)^{k+j}a_{i,k}\det M_{j,k}$$

We can view this as the cofactor expansion, along row $j$, of an altered matrix $B$: $$\sum_{k=1}^n(-1)^{k+j}b_{j,k}\det M_{j,k}$$ whose $(j,k)$th minors are exactly the same as $A$’s, the only alteration being that the $j$th row of $B$ is set to be the $i$th row of $A$. But wait! That means this equals $\det B$ where $B$ has two identical rows (namely, the $i$th row, which is copied over from $A$, and the $j$th row, which is defined to equal this $i$th row…), so $B$ is singular and $\det B=0$. That is, $(AC^t)_{i,j}=0$.
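The off-diagonal trick can be verified the same way: the sketch below computes one off-diagonal entry of $AC^t$, builds the altered matrix $B$ by copying row $i$ into row $j$, and checks that the entry equals $\det B = 0$ (again, `A` and the chosen position $(i,j)$ are arbitrary examples):

```python
import numpy as np

def cofactor(A, j, k):
    """(j, k) cofactor: signed determinant of A with row j and column k deleted."""
    minor = np.delete(np.delete(A, j, axis=0), k, axis=1)
    return (-1) ** (j + k) * np.linalg.det(minor)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
i, j = 0, 2                     # any off-diagonal position, chosen for illustration
entry = sum(A[i, k] * cofactor(A, j, k) for k in range(3))   # (A C^T)_{i,j}

B = A.copy()
B[j] = A[i]                     # the altered matrix: row j replaced by row i
# entry is the cofactor expansion of det B along row j, and B has two equal rows.
print(np.isclose(entry, np.linalg.det(B)), np.isclose(entry, 0.0))  # True True
```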

The matrix product is calculating determinants of altered matrices, and by the magic of the cofactor expansion we can arrange these altered matrices to be singular and non-singular in exactly the way we want. $(AC^t)_{i,j}=\delta_{i,j}\det A$ by this machinery, and similarly for $C^tA$, since the cofactor expansion can be performed down columns as well as along rows.

I think the core concept of this is that you can change the ‘$a$’ values without changing the minor matrices, thereby computing a determinant of a new matrix. The core mathematical ingredient that makes it work is this cofactor expansion identity. My guess is that someone noticed the minors in the cofactor formula are ignorant of their accompanying coefficients and so they decided to play around with changing the values. That was my thought process anyway, though reverse-engineering is different to creating.