Improve/extend my attempted intuitive explanation for why terms in determinant calculations have alternating signs


The determinant of the shape spanned by the points $(a,b)$ and $(c,d)$, as labelled in the gif below, is

$\left|\begin{matrix}a&c\\b&d\end{matrix}\right| = ad-bc$

The following process is the result of my trying to intuitively understand why this is, knowing that, for the two dimensional case, the determinant is simply the area.

[animated gif: the parallelogram is rearranged into a rectangle]

First calculate the height of the dark blue triangle, denoted $h$. ($\theta$ is the lower left angle of that triangle.)

$\begin{align*} \theta = \arctan\left(\frac{b}{a}\right) \end{align*}$

$\begin{align*} h &= c\tan\theta\\ &=c\frac{b}{a} \end{align*}$

In the final arrangement, the area of the shape is now clearly given by

$\begin{align*} Area &= a(d-h)\\ &= a \left(d - \frac{cb}{a}\right)\\ &= ad - bc \end{align*}$
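The shear construction above can be checked numerically. The following is just a sanity check with arbitrary sample values for $a,b,c,d$ (my addition, not part of the original argument):

```python
# Sanity check: for vectors (a, b) and (c, d), the area obtained from the
# shear construction equals the determinant formula ad - bc.
a, b = 3.0, 1.0   # first vector (a, b); sample values chosen arbitrarily
c, d = 1.0, 2.0   # second vector (c, d)

h = c * (b / a)          # height of the dark blue triangle: c * tan(theta)
area = a * (d - h)       # area of the rectangle after the rearrangement
det = a * d - b * c      # the 2x2 determinant formula

print(area, det)  # both are 5.0
```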

Was this a reasonable way to go about intuitively demonstrating the formula to oneself? Are there any more obvious/simpler ways I overlooked?


How would this (or an alternate intuitive demonstration) be extended to a three dimensional case, where the formula is the following and calculates volume?

$\begin{align*} |A| &= \left| \begin{matrix} a&b&c\\d&e&f\\g&h&i\end{matrix}\right|\\ &=a(ei-hf) - b(di - gf) + c(dh - ge) \end{align*}$

I'm particularly interested in why the second term (prefixed $b$) is negative.

Also, why would the $d$ and $h$ terms be negative, if you expanded along the second row: $|A| = -d(bi-hc) + e(ai-gc) - f(ah-gb)$?
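Both expansions quoted above can be verified to agree on a sample matrix (a quick check with made-up entries, not part of the question itself):

```python
# Expand along the first row and along the second row, with the
# alternating signs as written above; both should give the same value.
a, b, c = 2, 1, 3
d, e, f = 0, 4, 1
g, h, i = 5, 2, 6

row1 = a*(e*i - h*f) - b*(d*i - g*f) + c*(d*h - g*e)
row2 = -d*(b*i - h*c) + e*(a*i - g*c) - f*(a*h - g*b)

print(row1, row2)  # -11 -11
```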


In summary, I'd like to know: Why do the signs in the terms of determinant calculations alternate? Explain in a manner similar to my 'intuitive' attempt, if possible.


There are 2 answers below.

Best answer

How would this (or an alternate intuitive demonstration) be extended to a three dimensional case, where the formula is the following and calculates volume?

Yes. Just as the determinant of a $2\times 2$ matrix is the area of the parallelogram formed with the two column vectors as sides, so too is the determinant of a $3\times 3$ matrix the volume of the parallelepiped formed with the three column vectors as edges. And analogously for higher-dimensional square matrices.

Your animation is an excellent demonstration of how this is so, and a similar construction can be developed for the three-dimensional equivalent.

More precisely, the absolute value of the determinant equals this measure. The sign indicates the orientation of the column vectors relative to one another (widdershins or clockwise chirality).
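The sign-as-orientation point can be seen numerically: swapping the two column vectors reverses the orientation and flips the determinant's sign while leaving the area unchanged. (A small illustration of my own, with arbitrary sample vectors.)

```python
import numpy as np

u = np.array([3.0, 1.0])
v = np.array([1.0, 2.0])

det_uv = np.linalg.det(np.column_stack([u, v]))  # columns in order u, v
det_vu = np.linalg.det(np.column_stack([v, u]))  # swapped: orientation reversed

print(det_uv, det_vu)  # approximately 5.0 and -5.0: same area, opposite sign
```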

I'm particularly interested in why the second term (prefixed b) is negative.

It's an artifact of the order in which you're taught to extract the numbers. That is, take each of the entries of the first row in turn, place fingers over its row and column, multiply that entry by the determinant of the $2\times 2$ matrix you see, and alternately add and subtract the resulting terms.

$$\begin{align} |A| &= \begin{vmatrix} a&b&c\\d&e&f\\g&h&i\end{vmatrix}\\ &=a(ei-hf) - b(di - gf) + c(dh - ge) \end{align}$$

Instead, imagine the columns repeat to the right, multiply each of the three entries in the first row by the determinant of the $2\times 2$ matrix just to its lower right, and simply add the results together.

Or, alternatively, add up the three top-left-to-bottom-right diagonal products and subtract the next three top-right-to-bottom-left diagonal products.

Well, if that's as transparent as mud, can you see what I mean from the following?

$$\begin{align} |A| & = \left.\begin{vmatrix} a&b&c\\d&e&f\\g&h&i\end{vmatrix}\color{gray}{\begin{matrix} a&b&c\\d&e&f\\g&h&i\end{matrix}}\right| \\ &=a(ei-fh) + b(fg - di) + c(dh - eg) \\ &= aei+bfg+cdh-(afh+bdi+ceg) \end{align}$$
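The diagonal (rule-of-Sarrus) version can be cross-checked against a library routine. A sketch with arbitrary sample entries (my addition):

```python
import numpy as np

A = np.array([[2, 1, 3],
              [0, 4, 1],
              [5, 2, 6]], dtype=float)
(a, b, c), (d, e, f), (g, h, i) = A

# Rule of Sarrus: sum of the three "down-right" diagonal products minus
# the sum of the three "down-left" diagonal products.
sarrus = (a*e*i + b*f*g + c*d*h) - (a*f*h + b*d*i + c*e*g)

print(sarrus, np.linalg.det(A))  # both approximately -11
```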

Another answer

As Matt Samuel noted, this is because we use multilinear forms to measure area. Why multilinear forms? Because this takes into account not just area but orientation. Let's take a little jaunt through what an area function should be and why this involves multilinear algebra, and we'll finish with the general formula for a determinant.


Properties of an area function

Consider an $n$-dimensional vector space $V$. First, a little bit of philosophy. What do we mean when we say "area" in $V$? We're talking about the area of a parallelepiped spanned by some vectors. What properties should "area" have? It should be a function that:

  • Maps sets of $n$ vectors in $V$ to a number;
  • Changes proportionally to the length of any one of those vectors;
  • Vanishes when those $n$ vectors are linearly dependent.

Let's let $A$ be a candidate "area" function. Property 1 tells us what the domain and range are: $V^n\to\mathbb{R}$. Property 2 tells us that $A$ had better be multilinear. Property 3 is a little trickier, but it tells us that $A$ is going to be alternating. To wit: \begin{align*} 0 &= A(v_1,\ldots,v+w,v+w,\ldots,v_n)\\ &= A(v_1,\ldots,v,v,\ldots,v_n)+A(v_1,\ldots,v,w,\ldots,v_n)\\ &\qquad+A(v_1,\ldots,w,v,\ldots,v_n)+A(v_1,\ldots,w,w,\ldots,v_n)\\ &= A(v_1,\ldots,v,w,\ldots,v_n)+A(v_1,\ldots,w,v,\ldots,v_n), \end{align*} where the terms with a repeated argument vanish by Property 3 (a repeated vector makes the list linearly dependent). That is, $$ A(v_1,\ldots,v,w,\ldots,v_n) = -A(v_1,\ldots,w,v,\ldots,v_n). $$ This means that if we transpose two neighboring arguments, linearity and the "linearly dependent vectors have zero area" property force $A$ to pick up a minus sign. An area form $A$, then, measures not just the area spanned by a list of vectors but its signed area.
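The derivation above can be replayed numerically in two dimensions, using the $2\times 2$ determinant as the area form $A$ (my own illustration, with arbitrary sample vectors):

```python
import numpy as np

def A2(x, y):
    """Signed area of the parallelogram spanned by 2-vectors x and y."""
    return x[0]*y[1] - x[1]*y[0]

v = np.array([3.0, 1.0])
w = np.array([1.0, 2.0])

# Repeated argument: the inputs are linearly dependent, so the area vanishes...
print(A2(v + w, v + w))        # 0.0
# ...and multilinearity then forces the transposition sign flip:
print(A2(v, w) + A2(w, v))     # 0.0
```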

Such objects are called "alternating multilinear forms."


Multilinear algebra

There's a well-developed general framework for this: "multilinear algebra." Any advanced linear algebra textbook should cover it, I'm pretty sure it's in Dummit & Foote, it's covered in the preliminaries of pretty much any Riemannian geometry text, and in particular Hubbard and Hubbard's Vector Calculus, Linear Algebra, and Differential Forms is a good reference tying it to undergraduate linear algebra and multivariable calculus.

The point of this framework is to abstract away so that matrix notation doesn't interfere so much. (If you think the formula for a $3\times 3$ determinant is bad, try writing down an $n\times n$ one![*]) Associated to $V$ is its exterior algebra of alternating multilinear forms, $\oplus_k\Lambda^kV$. An element of $\Lambda^kV$ is defined to be a multilinear map $\eta:V^k\to\Bbb R$ such that if two adjacent arguments are transposed, the value changes sign. As an exercise, you should prove that $\Lambda^kV$ is a vector space of dimension $\binom{n}{k}$.

In particular, consider the top-dimensional forms, $\Lambda^nV$. These are all the different ways of taking in an ordered basis of $V$ and returning a number. As we discussed above, these are the ways of consistently assigning area to the parallelepipeds of $V$. There's a subtlety, though -- there are many ways to put a 1-dimensional vector space in correspondence with $\Bbb R$. That means there are many different possible choices of area form on $V$, with no "correct" one unless we make another choice.

However, if $M:V\to V$ is an isomorphism, a little piece of magic happens. $M$ induces a linear map $\Lambda^nV\to \Lambda^nV$ by pullback, and as a map of one-dimensional vector spaces it is simply multiplication by a number. That number is the determinant of $M$.

Why does $\det M$ measure distortion in area? Well, once we choose a generator of $\Lambda^nV$, we have declared it to be the area form, so the oriented area spanned by any basis is defined to be what this particular form spits out. The pullback of this form is defined by what comes out when you put in the image under $M$ of a given basis, and it equals $\det(M)$ times the original area form. In other words, the area spanned by the image of a basis is $\det(M)$ times the area spanned by the basis.
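A quick two-dimensional check of "the determinant measures area distortion" (my addition, with an arbitrary sample matrix): apply a linear map to the standard basis and compare the area its image spans with $\det(M)$.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # sample linear map, chosen arbitrarily

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
area_before = e1[0]*e2[1] - e1[1]*e2[0]            # unit square: area 1
img1, img2 = M @ e1, M @ e2
area_after = img1[0]*img2[1] - img1[1]*img2[0]     # area of the image parallelogram

print(area_after / area_before, np.linalg.det(M))  # both approximately 6.0
```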


The formula

What does this have to do with minus signs in the determinant formula? When we consider matrices, we have already chosen an oriented basis of $V$. Let this basis be $\{e_1,e_2,\ldots,e_n\}$. Since we have an oriented basis, let's let our area form be such that $A(e_1,e_2,\ldots,e_n) = 1$.

If $M:V\to V$, the matrix representing $M$ has entries defined by $Me_i = \sum_j M_{ij}e_j$. The determinant of $M$ is defined as \begin{align*} \det M &= A( Me_1, Me_2,\ldots, Me_n )\\ &= A\bigg( \sum_j M_{1j}e_j, \sum_j M_{2j}e_j,\ldots, \sum_j M_{nj}e_j\bigg). \end{align*} Since $A$ is multilinear, we can pull all the sums and coefficients out: $$ \ldots = \sum_{j_1,\ldots,j_n} M_{1j_1}M_{2j_2}M_{3j_3}\cdots M_{nj_n}\, A(e_{j_1},e_{j_2},\ldots,e_{j_n}). $$ The coup de grâce: since $A$ is alternating and the ordered basis $(e_1,\ldots,e_n)$ has area $1$, any term with a repeated index vanishes, and when we evaluate $A$ on a permutation of $(e_1,\ldots,e_n)$, the result is the sign of the permutation: $+1$ or $-1$ according to whether the permutation is a product of an even or odd number of transpositions.

So: $$ \det M = \sum_{j_1,\ldots,j_n} \operatorname{sgn}(j_1,j_2,\ldots,j_n)\,M_{1j_1}M_{2j_2}\cdots M_{nj_n}, $$ where $\operatorname{sgn}$ denotes the sign of the permutation $i\mapsto j_i$, and the sum effectively runs over permutations only, since any term with a repeated index vanishes.
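This summation formula can be implemented directly. Below is a sketch of my own (names `sign` and `det_leibniz` are mine): it sums over all permutations of the column indices, attaching the sign via an inversion count. It is $O(n \cdot n!)$, so it's only practical for tiny matrices, but it makes the alternating signs completely explicit.

```python
import math
from itertools import permutations

def sign(perm):
    """Sign of a permutation: (-1) ** (number of inversions)."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_leibniz(M):
    """Determinant via the permutation-sum formula above."""
    n = len(M)
    return sum(sign(p) * math.prod(M[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

M = [[2, 1, 3],
     [0, 4, 1],
     [5, 2, 6]]
print(det_leibniz(M))  # -11
```

For $n=2$ this reduces to exactly $ad-bc$: the identity permutation contributes $+ad$ and the single transposition contributes $-bc$.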


[*] Actually it's not so bad once you realize that the formula right above this says something like, "for each permutation, multiply together one entry from each row, taken from distinct columns, and attach the sign of the permutation" -- but realizing that comes from adopting this abstract perspective.