What is so special about $n$ vectors passed through an $\Bbb{R}^n \to \Bbb{R}^n$ linear map?


I take a linear map that transforms vectors $v \in \Bbb{R}^n$ into other vectors $u \in \Bbb{R}^n$. If we fix a basis, this is of course expressed as an $n \times n$ matrix.

Consider taking a single vector, $v$, that has length $|v|$. The linear map transforms it to another vector, $u$, that has some other length, $|u|$. What is the ratio $\frac{|u|}{|v|}$? It depends on the vector $v$ chosen: the ratio is largest when $v$ points along the direction of maximal stretch (the top singular vector of the matrix). So we can't use this ratio as a property of the linear map alone, since it also depends on the choice of the original vector $v$.

Now, let's choose two vectors, $v_1$ and $v_2$. They are mapped respectively to the vectors $u_1$ and $u_2$. Let's write $|v_1, v_2|$ for the area of the parallelogram formed by these two vectors. The ratio $\frac{|u_1, u_2|}{|v_1,v_2|}$ will also depend on $v_1$ and $v_2$, and so is not a property of the linear map alone.

We keep going in this way until we get $n$ vectors, $v_1, v_2, \dots, v_n$. And now something magical happens. Suddenly, $\frac{|u_1, u_2, \dots, u_n|}{|v_1, v_2, \dots, v_n|}$ is a function only of the linear map and doesn't depend on the vectors $v_1, v_2, \dots, v_n$. This ratio can therefore be used as a property strictly of the linear map, and it is, of course, the (absolute value of the) determinant.

My question: what changed at $n$ vectors that things suddenly "snapped" into place? And can we prove that only with $n$ vectors will we get the behavior above, and never (in general) with $n-k$ vectors for $k \in \{1, 2, \dots, n-1\}$?
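The claimed behavior is easy to check numerically. Below is a minimal numpy sketch (the random $4 \times 4$ matrix and the Gram-determinant helper are illustrative choices, not part of the question): the $k$-dimensional volume of a parallelotope spanned by vectors $v_1, \dots, v_k$ is $\sqrt{\det(V^\top V)}$, where $V$ has the $v_i$ as columns. The volume ratio varies from sample to sample for every $k < n$, and collapses to the single constant $|\det A|$ at $k = n$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))  # a generic n x n linear map

def k_volume(vectors):
    """k-dimensional volume of the parallelotope spanned by the given
    vectors, via the Gram determinant sqrt(det(V^T V))."""
    V = np.column_stack(vectors)
    return np.sqrt(np.linalg.det(V.T @ V))

for k in range(1, n + 1):
    ratios = []
    for _ in range(5):
        vs = [rng.standard_normal(n) for _ in range(k)]
        us = [A @ v for v in vs]
        ratios.append(k_volume(us) / k_volume(vs))
    # The spread is of order 1 for k < n, and vanishes (up to floating
    # point error) exactly when k == n.
    print(k, max(ratios) - min(ratios))
```

For $k = n$ the ratio equals $|\det A|$ identically, since $\sqrt{\det(V^\top A^\top A V)} = |\det A| \sqrt{\det(V^\top V)}$ when $V$ is square.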


We can consider the behavior of an $n\times n$ matrix $A$ on the unit $n$-cube spanned by some basis vectors $b_1,...,b_n$. For the sake of clarity, let's assume that this is an orthonormal basis (which justifies calling their parallelotope a unit $n$-cube); the same reasoning applies to the general case. The matrix sends this unit $n$-cube to the parallelotope spanned by the vectors $A(b_1),...,A(b_n)$ in the codomain. Because the vectors $b_1,...,b_n$ are linearly independent, the matrix is free to choose the length and direction of each $A(b_i)$ independently -- and both are essential in determining the $n$-measure of their parallelotope.

The idea is that, if the lengths and directions amongst the $A(b_i)$s are sufficiently different, access to all of these lengths and directions is necessary to compute an invariant notion of measure scaling from the domain to the codomain. If we take a lower-dimensional plane through the parallelotope, the fact that different $A(b_i)$s could be given different lengths and directions produces different lower-dimensional scaling factors from the domain to the codomain, depending on which linear combination of the $A(b_i)$s generates that subplane. The fact that the subplane is of lower dimension guarantees some asymmetry between the $A(b_i)$s in the linear combination that generates it, and their differing lengths and directions make this asymmetry result in differing scaling factors depending on the particular subplane and generating linear combination.

Here's a less abstract example: take the basis $b_1=\begin{bmatrix}1\\0\\0\end{bmatrix},b_2=\begin{bmatrix}0\\1\\0\end{bmatrix},b_3=\begin{bmatrix}0\\0\\1\end{bmatrix}$ and the matrix $A=\begin{bmatrix} 2 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & 5 \end{bmatrix}$ so that $A(b_1)=\begin{bmatrix}2\\0\\0\end{bmatrix}, A(b_2)=\begin{bmatrix}0\\3\\0\end{bmatrix}, A(b_3)=\begin{bmatrix}0\\0\\5\end{bmatrix}$. Because $A(b_1),A(b_2),$ and $A(b_3)$ are asymmetric to one another, we can mix and match them in $2$-dimensional subplanes to get different domain-to-codomain scalings. The scaling factor from the $x,y$-plane to its image is $|A(b_1)||A(b_2)|=6$ (note that the orthogonality of the images allows for such a simple computation); the scaling factor from the $y,z$-plane to its image is $|A(b_2)||A(b_3)|=15$; the scaling factor from the $x,z$-plane to its image is $|A(b_1)||A(b_3)|=10$. One way to see that taking all $3$ dimensions "snaps into place" an invariant scaling is that there's no option for this "mix and match" process when you're already considering all dimensions.
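The three plane-dependent scaling factors, and the single invariant $3$-dimensional one, can be verified directly with numpy (the `area` helper, computing a parallelogram area via the Gram determinant, is an illustrative addition):

```python
import numpy as np

A = np.diag([2.0, 3.0, 5.0])
b1, b2, b3 = np.eye(3)  # the standard orthonormal basis of R^3

def area(u, v):
    """Area of the parallelogram spanned by u and v (Gram determinant)."""
    V = np.column_stack([u, v])
    return np.sqrt(np.linalg.det(V.T @ V))

# 2-dimensional scaling factors depend on which coordinate plane we pick:
print(area(A @ b1, A @ b2) / area(b1, b2))  # x,y-plane: 6.0
print(area(A @ b2, A @ b3) / area(b2, b3))  # y,z-plane: 15.0
print(area(A @ b1, A @ b3) / area(b1, b3))  # x,z-plane: 10.0

# The full 3-dimensional scaling is invariant: |det A| = 30
print(abs(np.linalg.det(A)))
```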

So that's the case where we map an orthonormal basis into the codomain and consider what happens -- what about for an arbitrary basis? Or even for an arbitrary subset $S$ of the domain that isn't even a linear subspace (like the unit ball)? One way I think of a transformation $A$ as being linear is that it acts "uniformly" on different pockets of the domain, so long as all $n$ dimensions are represented in the pocket (for the same "mix and match" avoidance as above). Roughly, we can first notice that linearity makes $A$ scale $n$-cubes in the same way it did the orthonormal unit $n$-cube given by the basis above, regardless of the cube's side length or position in the space. We can then fill any $n$-dimensional set $S$ in the domain with $n$-cubes of differing sizes and conclude, by some limiting argument, that $S$ is scaled by the same factor as all of the cubes. Notice that $S$ needed to be $n$-dimensional so that the $n$-cubes could fit inside to approximate it, lest we have another "mix and match" issue when filling it with cubes of some dimension smaller than $n$. This also shows that if $S$ has any parts with dimension smaller than $n$, we cannot hope in general for an invariant scaling for $S$ when compared to some other $S'$ with dimension smaller than $n$, because we could hit another "mix and match" type issue.
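The cube-filling argument can be sanity-checked numerically on the unit ball. The sketch below (the particular matrix is a hypothetical example with $\det A = 7$, not taken from the answer) estimates the volume of the image $A(B)$ of the unit ball $B$ by Monte Carlo, using the fact that $x \in A(B)$ exactly when $A^{-1}x \in B$, and compares the scaling factor to $|\det A|$:

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed invertible map on R^3 (an illustrative choice, det A = 7).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
A_inv = np.linalg.inv(A)

# Sample points in a box containing A(B) and count how many land inside
# the image; membership test: x is in A(B)  <=>  |A^{-1} x| <= 1.
R = np.linalg.norm(A, 2)             # operator norm bounds the image's radius
N = 200_000
pts = rng.uniform(-R, R, size=(N, 3))
inside = np.linalg.norm(pts @ A_inv.T, axis=1) <= 1.0
vol_image = inside.mean() * (2 * R) ** 3

vol_ball = 4.0 / 3.0 * np.pi         # volume of the unit ball in R^3
print(vol_image / vol_ball)          # ≈ |det A| = 7
print(abs(np.linalg.det(A)))
```

The estimated ratio agrees with $|\det A|$ up to Monte Carlo error, even though the unit ball is nothing like a parallelotope -- exactly as the cube-filling argument predicts for any full-dimensional set.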