The definition of Determinant in the spirit of algebra and geometry


The concept of the determinant is a rather unmotivated topic to introduce. Textbooks use "strung out" introductions such as the axiomatic definition, the Laplace expansion, Leibniz's permutation formula, or something like signed volume.

Question: is the following a possible way to introduce the determinant?


The determinant is all about determining whether a given set of vectors is linearly independent, and a direct way to check this is to add scalar multiples of the column vectors to one another until we reach a diagonal form:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \\ \end{pmatrix} \thicksim \begin{pmatrix} d_1 & 0 & 0 & 0 \\ 0 & d_2 & 0 & 0 \\ 0 & 0 & d_3 & 0 \\ 0 & 0 & 0 & d_4 \\ \end{pmatrix}.$$

During the diagonalization process we demand that the information, i.e. the determinant, remains unchanged. Now it is clear that the vectors are linearly independent precisely when every $d_i$ is nonzero, i.e. $\prod_{i=1}^n d_i\neq0$. It may also happen that two columns are equal, in which case no diagonal form exists; so we must add a condition that annihilates the determinant (consistent with $\prod_{i=1}^n d_i=0$), since such column vectors cannot be linearly independent.
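To make the reduction concrete, here is a minimal Python sketch (my own illustration, not part of the question) that computes the determinant exactly as described: it reduces the matrix to triangular form using only the determinant-preserving operation "add a multiple of one column to another", then multiplies the diagonal entries.

```python
from fractions import Fraction

def det_by_column_ops(rows):
    """Determinant via column operations col_k += t * col_j, which
    leave the determinant unchanged.  The matrix is reduced to
    lower-triangular form; the determinant is then the product of
    the diagonal entries."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    for j in range(n):
        # If the pivot a[j][j] is zero, add a later column with a
        # nonzero entry in row j (this also preserves the determinant).
        if a[j][j] == 0:
            for k in range(j + 1, n):
                if a[j][k] != 0:
                    for i in range(n):
                        a[i][j] += a[i][k]
                    break
        if a[j][j] == 0:
            continue  # row j is already zero from the diagonal on
        # Annihilate the entries to the right of the pivot in row j.
        for k in range(j + 1, n):
            t = a[j][k] / a[j][j]
            for i in range(n):
                a[i][k] -= t * a[i][j]
    prod = Fraction(1)
    for j in range(n):
        prod *= a[j][j]
    return prod
```

Using exact `Fraction` arithmetic sidesteps floating-point noise; the zero-pivot case is handled by another column addition rather than a swap, so only the single operation from property (1) is ever used.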

If we want a real-valued function that provides this information, then we simply introduce an ad hoc function $\det:\mathbb{R}^{n \times n} \rightarrow \mathbb{R}$ with the following properties:

  1. $$\det (a_1,\ldots,a_i,\ldots,a_j,\ldots,a_n)=\det (a_1,\ldots,a_i,\ldots,k\cdot a_i+a_j,\ldots,a_n).$$

  2. $$\det(d_1\cdot e_1,\ldots,d_n\cdot e_n)=\prod_{i=1}^n d_i.$$

  3. $$\det (a_1,\ldots,a_i,\ldots,a_j,\ldots,a_n)=0, \space \space \text{if} \space \space a_i=a_j.$$


From the previous definition of determinant we can infer the multilinearity property:

$$[a_1,\ldots,c_1 \cdot u+c_2 \cdot v,\ldots,a_n]\thicksim diag[d_1,\ldots,c_1 \cdot d'_i+c_2 \cdot d''_i ,\ldots,d_n],$$ so $$\det[a_1,\ldots,c_1 \cdot u+c_2 \cdot v,\ldots,a_n]=\prod_{j=1:j\neq i}^n d_j(c_1 \cdot d'_i+c_2 \cdot d''_i)$$ $$=c_1\det(diag[d_1,\ldots, d'_i,\ldots,d_n])+c_2\det(diag[d_1,\ldots, d''_i,\ldots,d_n])$$ $$=c_1\det[a_1,\ldots,u,\ldots,a_n]+c_2\det[a_1,\ldots, v,\ldots,a_n].$$
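This column-linearity can be sanity-checked numerically. The sketch below (my own illustration) uses a naive cofactor expansion for $\det$ and verifies the identity for a random matrix and column index:

```python
from fractions import Fraction
import random

def det(m):
    # Cofactor (Laplace) expansion along the first row; fine for tiny matrices.
    n = len(m)
    if n == 1:
        return m[0][0]
    total = Fraction(0)
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def col_replace(m, i, col):
    # Copy of m with column i replaced by the vector `col`.
    return [row[:i] + [col[r]] + row[i+1:] for r, row in enumerate(m)]

random.seed(0)
n, i = 3, 1
A = [[Fraction(random.randint(-5, 5)) for _ in range(n)] for _ in range(n)]
u = [Fraction(random.randint(-5, 5)) for _ in range(n)]
v = [Fraction(random.randint(-5, 5)) for _ in range(n)]
c1, c2 = Fraction(2), Fraction(-3)
combo = [c1 * u[r] + c2 * v[r] for r in range(n)]
lhs = det(col_replace(A, i, combo))
rhs = c1 * det(col_replace(A, i, u)) + c2 * det(col_replace(A, i, v))
assert lhs == rhs  # linearity in column i
```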

Note that, given the multilinearity just derived, property $(2)$ reduces to the single normalization $\det(e_1,\ldots,e_n)=1$ (its case $d_i=1$); these are exactly the axioms found in the literature, so we know that the determinant function $\det:\mathbb{R}^{n \times n} \rightarrow \mathbb{R}$ actually exists and is unique.


The determinant also offers information about how orthogonal a set of vectors is. With the Gram-Schmidt process we can form an orthogonal set of vectors from the set $(a_1,\ldots, a_n)$, and by multilinearity and property $(2)$ the absolute value of the determinant is the volume of the parallelepiped spanned by the set of vectors.

Definition. The volume of the parallelepiped formed by the set of vectors $(a_1,\ldots, a_n)$ is $Vol(a_1,\ldots, a_n)=Vol(a_1,\ldots, a_{n-1})\cdot |a_{n}^{\bot}|=|a_{1}^{\bot}|\cdots |a_{n}^{\bot}|$, where $a_{i}^{\bot}$ is the component of $a_i$ orthogonal to $\operatorname{span}(a_1,\ldots, a_{i-1}).$
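This definition can be checked against the determinant directly. A small Python sketch (my own illustration, using floating-point arithmetic): it runs Gram-Schmidt to obtain the $|a_i^{\bot}|$ and compares their product with $|\det|$.

```python
import math

def det3(m):
    # Explicit 3x3 determinant by cofactor expansion along the first row.
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def volume(vectors):
    """Vol(a_1,...,a_n) = |a_1^perp| * ... * |a_n^perp|, where a_i^perp
    is the component of a_i orthogonal to span(a_1,...,a_{i-1})."""
    basis = []   # orthogonal vectors found so far (Gram-Schmidt)
    vol = 1.0
    for v in vectors:
        perp = list(v)
        for b in basis:
            coef = sum(p*q for p, q in zip(perp, b)) / sum(q*q for q in b)
            perp = [p - coef*q for p, q in zip(perp, b)]
        basis.append(perp)
        vol *= math.sqrt(sum(p*p for p in perp))
    return vol

cols = [(1.0, 2.0, 0.0), (0.0, 1.0, 1.0), (1.0, 0.0, 3.0)]
# det3 takes rows, so build the matrix whose columns are `cols`.
M = [[cols[j][i] for j in range(3)] for i in range(3)]
assert abs(abs(det3(M)) - volume(cols)) < 1e-9
```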


This approach to the determinant works equally well whether we begin with the volume of a parallelepiped (geometric approach) or with the question of invertibility (algebraic approach). I was motivated by chapter 5 of the book Linear Algebra and Its Applications by Lax:

Rather than start with a formula for the determinant, we shall deduce it from the properties forced on it by the geometric properties of signed volume. This approach to determinants is due to E. Artin.

  1. $\det (a_1,\ldots,a_n)=0$, if $a_i=a_j$, $i\neq j.$
  2. $\det (a_1,\ldots,a_n)$ is a multilinear function of its arguments, in the sense that if all $a_i, i \neq j$ are fixed, $\det$ is a linear function of the remaining argument $a_j.$
  3. $\det(e_1,\ldots,e_n)=1.$

There are 5 best solutions below

---

A rationale I could give for the determinant function is that it generalises a property of powers: we know that for any scalars $x,a,b,c,\ldots$ it holds that $x^{a + b + c + \cdots} = x^a x^b x^c\cdots$

But let us assume that $a,b,c,\ldots$ are actually the diagonal entries of a matrix $\mathbf{A}$; then it holds that $x^{\operatorname{trace}(\mathbf{A})} = \det(x^{\mathbf{A}})$, with $x^{\mathbf{A}}$ the matrix exponential of base $x$.
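The identity $x^{\operatorname{trace}(\mathbf{A})}=\det(x^{\mathbf{A}})$ in fact holds for any square matrix, not only diagonal ones. A stdlib-only Python sketch (my own, computing $x^{\mathbf{A}}=\exp((\ln x)\mathbf{A})$ by its power series) illustrates this for a non-diagonal $2\times 2$ matrix:

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k]*B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow_base_x(x, A, terms=40):
    """x**A = exp(ln(x) * A), summed as the exponential power series."""
    n = len(A)
    M = [[math.log(x) * A[i][j] for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, M)                      # term = M^k / k!
        term = [[t / k for t in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def det2(M):
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

x = 2.0
A = [[1.0, 3.0], [0.5, -2.0]]      # an arbitrary, non-diagonal matrix
lhs = x ** (A[0][0] + A[1][1])      # x^trace(A)
rhs = det2(mat_pow_base_x(x, A))    # det(x^A)
```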

---

$\det$ is the only alternating multilinear map such that $\det I = 1$. Properties (2) and (3) combined with the following linearity property would define $\det$ uniquely: $$ \det(a_1,\dots,u,\dots,a_n) + \det(a_1,\dots,\lambda v,\dots,a_n) = \det(a_1,\dots,u + \lambda v,\dots,a_n) $$ The only thing missing from your definition is linearity.

---

That seems quite opaque: It's a way of computing a quantity rather than telling what exactly it is or even motivating it. It also leaves completely open the question of why such a function exists and is well-defined. The properties you give are sufficient if you're trying to put a matrix in upper-triangular form, but what about other computations? It also gives no justification for one of the most important properties of the determinant, that $\det(ab) = \det a \det b$.

I think the best way to define the determinant is to introduce the exterior algebra $\Lambda^* V$ of a finite-dimensional space $V$. Given that, any map $f:V \to V$ induces a map $\bar{f}:\Lambda^n V \to \Lambda^n V$, where $n = \dim V$. But $\Lambda^n V$ is a $1$-dimensional space, so $\bar{f}$ is just multiplication by a scalar (independent of a choice of basis); that scalar is by definition exactly $\det f$.

Then, for example, we get the condition that $\det f\not = 0$ iff $f$ is an isomorphism for free: for a basis $v_1, \dots, v_n$ of $V$, we have $\det f\not = 0$ iff $\bar{f}(v_1\wedge \cdots \wedge v_n) = f(v_1) \wedge \cdots \wedge f(v_n) \not = 0$; that is, iff the $f(v_i)$ are linearly independent. Furthermore, since $h = fg$ has $\bar{h} = \bar{f}\bar{g}$, we have $\det(fg) = \det f \det g$. The other properties follow similarly.

It requires a bit more sophistication than is usually assumed in a linear algebra class, but it's the first construction of $\det$ I've seen that's motivated and transparently explains what's otherwise a list of arbitrary properties.

---

The geometric meaning of the determinant of, say, a 3 by 3 matrix is the (signed) volume of the parallelepiped spanned by the three column vectors (alternatively the three row vectors). This generalizes the (signed) area of the parallelogram spanned by the two column vectors of a 2 by 2 matrix.

At the next stage, to pursue the geometric definition you would have to clarify the meaning of "signed" above. The naive definition of volume is always positive whereas the determinant could be negative, so there is some explaining to do in terms of orientations.

The route most often chosen both by instructors and textbook writers is the algebraic one where one can write down a magic formula and, boom! the determinant has been defined. This is fine if you want to get through a certain amount of material required by the course, but pedagogically this may not be the best approach.

Ultimately a combination of the geometry and the algebra is required to explain this concept properly. It connects to more advanced topics like exterior algebras but that's the next stage already.

---

The way I teach determinants to my students is to start with the case $n=2$, and to use the complex numbers and/or trigonometry in order to show that, for vectors $(a,b), (c,d)$ in the plane, the quantity $$ad-bc=||(a,b)||\cdot ||(c,d)|| \sin \theta$$ is the signed area between $(a,b)$ and $(c,d)$ (in this order), where $\theta$ is the angle from $(a,b)$ to $(c,d)$.
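For readers who want to see the $n=2$ identity in action, here is a small Python sketch (my own illustration) comparing $ad-bc$ with $\|u\|\,\|v\|\sin\theta$, with $\theta$ measured from $u=(a,b)$ to $v=(c,d)$:

```python
import math

def signed_area(u, v):
    """Signed area of the parallelogram spanned by u then v: ad - bc."""
    (a, b), (c, d) = u, v
    return a*d - b*c

def signed_area_trig(u, v):
    # ||u|| * ||v|| * sin(theta), theta measured from u to v.
    theta = math.atan2(v[1], v[0]) - math.atan2(u[1], u[0])
    return math.hypot(*u) * math.hypot(*v) * math.sin(theta)

u, v = (3.0, 1.0), (1.0, 2.0)
lhs = signed_area(u, v)
rhs = signed_area_trig(u, v)
```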

Then, using the vector product and its properties (we have seen it before coming to the topic of determinants in full generality), we check that $3$ by $3$ determinants carry the meaning of signed volumes.

The next step is to introduce determinants as alternating multilinear functions. We have seen examples of bilinear maps (inner products), trilinear maps, such as $$(u,v,w)\mapsto (u\times v)\bullet w,$$ and the quadrilinear maps $$(a,b,c,d)\mapsto (a\bullet c) (b\bullet d)-(b\bullet c) (a\bullet d),$$ $$(a,b,c,d)\mapsto (a \times b)\bullet (c\times d).$$

Now, when explaining multilinearity we emphasised that the equality of the last two examples can be proven by checking it only in the case where $a,b,c,d$ are vectors of the canonical basis.
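The equality of those two quadrilinear maps (the Binet-Cauchy, or Lagrange, identity) is easy to spot-check numerically; a small Python sketch (my own illustration):

```python
import random

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

random.seed(1)
a, b, c, d = [tuple(random.randint(-4, 4) for _ in range(3))
              for _ in range(4)]
lhs = dot(a, c)*dot(b, d) - dot(b, c)*dot(a, d)
rhs = dot(cross(a, b), cross(c, d))
assert lhs == rhs  # exact, since the entries are integers
```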

Then the time comes to define the determinant of $n$ vectors in $\mathbb{R}^n$, which is a new example of an $n$-linear, alternating function. The students check that the vector space of such maps has dimension $\binom{n}{n}=1.$ They thus learn that the determinant is essentially the only possible such function, up to a scalar multiple, in the same way they saw that more general multilinear maps depend exclusively on their values on vectors of a chosen basis (say, the canonical basis in our case).

Although I learnt to prove stuff such as $\det(AB)=\det(A) \det(B)$ by strict functoriality, in class we do define the map $$L(X_1, \ldots , X_n)=\det(AX_1, \ldots , AX_n),$$ which by uniqueness is a constant multiple of the determinant function $T(X_1, \ldots , X_n)=\det(X_1, \ldots, X_n),$ and compute the constant by evaluating on the identity matrix, i.e. $X_i=e_i.$

Thus $\det(AB)=\det(A)\det(B).$
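This multiplicativity can be spot-checked numerically; the sketch below (my own illustration) forms $AB$ column by column, i.e. applies $A$ to the columns of $B$ exactly as in the map $L$ above, and compares determinants:

```python
def det3(m):
    # Explicit 3x3 determinant by cofactor expansion along the first row.
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def mat_mul(A, B):
    # The j-th column of AB is A applied to the j-th column of B.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 1, 1], [0, 1, 0], [1, 0, 2]]
assert det3(mat_mul(A, B)) == det3(A) * det3(B)
```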

A rigorous construction that uses elementary matrices can be found in T.W. Korner's book Vectors, Pure and Applied; the OP can check it for a nice, slightly more down-to-earth exposition.

In op. cit. one can see how Korner uses the fact that an invertible matrix can be decomposed as a product of elementary matrices to obtain the formula $\det(AB)=\det(A)\det(B).$

Note: I have been deliberately brief in my exposition, just so as not to repeat too much stuff that was already included in other answers.