Why is it important for a matrix to be square?


I am currently trying to self-study linear algebra. I've noticed that a lot of the definitions for terms (like eigenvectors, characteristic polynomials, determinants, and so on) require a square matrix instead of just any real-valued matrix. For example, Wolfram has this in its definition of the characteristic polynomial:

The characteristic polynomial is the polynomial left-hand side of the characteristic equation $\det(A - I\lambda) = 0$, where $A$ is a square matrix.

Why must the matrix be square? What happens if the matrix is not square? And why do square matrices come up so frequently in these definitions? Sorry if this is a really simple question, but I feel like I'm missing something fundamental.

There are 7 answers below.

Accepted answer:

Remember that an $n$-by-$m$ matrix with real-number entries represents a linear map from $\mathbb{R}^m$ to $\mathbb{R}^n$ (or more generally, an $n$-by-$m$ matrix with entries from some field $k$ represents a linear map from $k^m$ to $k^n$). When $m=n$ - that is, when the matrix is square - we're talking about a map from a space to itself.

So really your question amounts to:

Why are maps from a space to itself - as opposed to maps from a space to something else - particularly interesting?

Well, the point is that when I'm looking at a map from a space to itself, the inputs to and outputs from that map are the same "type" of thing, and so I can meaningfully compare them. So, for example, if $f:\mathbb{R}^4\rightarrow\mathbb{R}^4$, it makes sense to ask when $f(v)$ is parallel to $v$, since $f(v)$ and $v$ lie in the same space; but asking when $g(v)$ is parallel to $v$ for $g:\mathbb{R}^4\rightarrow\mathbb{R}^3$ doesn't make any sense, since $g(v)$ and $v$ are just different types of objects. (This example, by the way, is just saying that eigenvectors/eigenvalues make sense when the matrix is square, but not when it isn't.)
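As a concrete sketch of this point (a Python illustration, not part of the original answer; the matrices and vectors are made up for the example):

```python
# Matrices as lists of rows; vectors as lists.
def apply(matrix, v):
    """Apply a matrix to a vector by row-times-vector dot products."""
    return [sum(row[i] * v[i] for i in range(len(v))) for row in matrix]

def is_parallel(u, v):
    """True if u is a scalar multiple of the nonzero vector v."""
    # Fix the scalar using the first nonzero coordinate of v.
    k = next(i for i, x in enumerate(v) if x != 0)
    scale = u[k] / v[k]
    return all(abs(u[i] - scale * v[i]) < 1e-9 for i in range(len(v)))

A = [[2, 0],
     [0, 3]]              # square: maps R^2 to R^2

v = [1, 0]
Av = apply(A, v)          # [2, 0] -- lives in the same space as v
print(is_parallel(Av, v))  # True: v is an eigenvector with eigenvalue 2

G = [[1, 0, 0],
     [0, 1, 0]]           # 2-by-3: maps R^3 to R^2
w = [1, 2, 3]
Gw = apply(G, w)          # [1, 2] -- lives in R^2, so "parallel to w"
print(len(Gw) == len(w))  # False: can't even compare componentwise
```

The point of the last line: for the non-square map, input and output don't even have the same number of coordinates, so the parallel-to question never gets off the ground.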


As another example, let's consider the determinant. The geometric meaning of the determinant is that it measures how much a linear map "expands/shrinks" a unit of (signed) volume - e.g. the map $(x,y,z)\mapsto(-2x,2y,2z)$ takes a unit of volume to $-8$ units of volume, so has determinant $-8$. What's interesting is that this applies to every blob of volume: it doesn't matter whether we look at how the map distorts the usual 1-1-1 cube, or some other random cube.
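Here is a small sketch (Python, not from the original answer) of both claims: the map $(x,y,z)\mapsto(-2x,2y,2z)$ has determinant $-8$, and the same scaling factor applies to a differently-shaped box, since $\det(AB)=\det(A)\det(B)$:

```python
def det3(m):
    """3x3 determinant by cofactor expansion along the first row."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def matmul(m, n):
    return [[sum(m[r][k] * n[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

A = [[-2, 0, 0],
     [ 0, 2, 0],
     [ 0, 0, 2]]          # the map (x, y, z) -> (-2x, 2y, 2z)
print(det3(A))            # -8: unit volume becomes -8 units (orientation flips)

B = [[2, 0, 0],           # columns span a 2 x 1 x 3 box, volume 6
     [0, 1, 0],
     [0, 0, 3]]
# The image of that box has signed volume det(A) * 6 = -48:
print(det3(matmul(A, B)), det3(A) * det3(B))   # -48 -48
```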

But what if we try to go from $3$D to $2$D (so we're considering a $2$-by-$3$ matrix) or vice versa? Well, we can try to use the same idea: (proportionally) how much area does a given volume wind up producing? However, we now run into problems:

  • If we go from $3$ to $2$, the "stretching factor" is no longer invariant. Consider the projection map $(x,y,z)\mapsto (x,y)$, and think about what happens when I stretch a bit of volume vertically: its volume grows, but its projected area stays exactly the same, so the ratio of output area to input volume depends on the shape of the blob.

  • If we go from $2$ to $3$, we're never going to get any volume at all - the starting dimension is just too small! So regardless of what map we're looking at, our "stretching factor" seems to be $0$.

The point is, in the non-square case the "determinant" as naively construed either is ill-defined or is $0$ for stupid reasons.
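A quick numerical sketch of the first bullet (Python, added for illustration): for the projection $(x,y,z)\mapsto(x,y)$, two different unit squares produce different amounts of area, so no single determinant-like number can exist:

```python
def project(v):
    """The projection (x, y, z) -> (x, y)."""
    return (v[0], v[1])

def area2(u, v):
    """Signed area of the parallelogram spanned by two 2D vectors."""
    return u[0] * v[1] - u[1] * v[0]

# Unit square lying in the xy-plane: projects to area 1.
print(area2(project((1, 0, 0)), project((0, 1, 0))))   # 1

# Unit square lying in the xz-plane: projects to a flat segment, area 0.
print(area2(project((1, 0, 0)), project((0, 0, 1))))   # 0
```

Same map, same amount of input area, two different "stretching factors" -- exactly the failure of invariance described above.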

Answer:

In linear algebra, matrices usually represent linear transformations between vector spaces. An $m \times n$ matrix $M$ represents a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$.

When the matrix is square ($m=n$) it can be thought of as representing a transformation from $\mathbb{R}^n$ to itself. That's when the concepts of eigenvalues and the characteristic polynomial make sense.
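For a made-up $2\times 2$ example (a Python sketch, not part of the original answer): $\det(A-\lambda I)$ expands to $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$, and its roots are the eigenvalues. Note that forming $A - \lambda I$ already requires $A$ to be square:

```python
A = [[4, 1],
     [2, 3]]
trace = A[0][0] + A[1][1]                     # 7
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # 10

# Characteristic polynomial: lambda^2 - 7*lambda + 10.
# Its roots are the eigenvalues, via the quadratic formula:
disc = (trace ** 2 - 4 * det) ** 0.5
eigs = sorted([(trace - disc) / 2, (trace + disc) / 2])
print(eigs)   # [2.0, 5.0]
```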

Answer:

To add to @Ethan's answer: what is fundamental when a linear map goes from a vector space to itself is that you can compare a vector with its image:

  • was it rotated?
  • was its length scaled?
  • was it turned into a multiple of itself?

Or a set of vectors with its image:

  • by how much was the volume of this parallelepiped stretched?

All these questions make sense (and carry an awful lot of information on the linear map) only when the domain and codomain coincide. The determinant, characteristic polynomial, etc. are there to answer these questions.

Answer:

Lots of good answers already as to why square matrices are so important. But just so you don't think that other matrices are not interesting: they have analogues of the inverse (e.g., the Moore-Penrose inverse), and non-square matrices have a singular-value decomposition, where the singular values play a role loosely analogous to the eigenvalues of a square matrix. These topics are often left out of linear algebra courses, but they can be important in numerical methods for statistics and machine learning. Still, learn the square-matrix results before the fancy non-square-matrix results, since the former provide a context for the latter.
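A hand-rolled sketch of the Moore-Penrose pseudoinverse (Python, added for illustration; the matrix is made up): for a tall matrix with full column rank, $A^+ = (A^\mathsf{T} A)^{-1} A^\mathsf{T}$, and $A^+ A = I$ even though $A$ has no ordinary inverse. In practice one computes this via the SVD (e.g. `numpy.linalg.pinv`), but the formula below shows the idea:

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(m, n):
    return [[sum(m[r][k] * n[k][c] for k in range(len(n)))
             for c in range(len(n[0]))] for r in range(len(m))]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    a, b = m[0]
    c, d = m[1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 0],
     [0, 1],
     [1, 1]]                     # 3x2: not square, no ordinary inverse

At = transpose(A)
A_pinv = matmul(inv2(matmul(At, A)), At)   # 2x3 pseudoinverse

P = matmul(A_pinv, A)
print(P)   # approximately the 2x2 identity
```

Note the asymmetry: $A^+ A = I_2$, but $A A^+$ is only a projection onto the column space of $A$, not $I_3$ -- another place where non-squareness shows up.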

Answer:

One point not made explicit in the answers so far is that powers of a matrix only make sense if the matrix is square.

A square matrix $\mathcal{M}(A)$ represents a linear map $A:V \rightarrow W$ for which the dimensions of its domain $V$ and its codomain $W$ are the same, so its domain and its codomain are isomorphic as vector spaces (often we think of the domain and codomain as being the same vector space, but strictly speaking this is not necessary).

If the domain and codomain of $A$ are the same vector space then we can compose $A$ with itself, creating linear maps $A \circ A$, $A \circ A \circ A$, etc. These are in turn represented by the square matrices $\mathcal{M}(A)^2$, $\mathcal{M}(A)^3$, etc. This allows us to define polynomial functions of a square matrix, and hence to make sense of statements such as the Cayley-Hamilton theorem.
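To see Cayley-Hamilton in action (a Python sketch with a made-up matrix, not part of the original answer): for a $2\times 2$ matrix, the theorem says $A^2 - \operatorname{tr}(A)\,A + \det(A)\,I = 0$, and the statement only parses because $A^2$ exists:

```python
def matmul(m, n):
    return [[sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

A = [[1, 2],
     [3, 4]]
trace = A[0][0] + A[1][1]                     # 5
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # -2

A2 = matmul(A, A)
# Evaluate A^2 - trace(A)*A + det(A)*I entrywise:
residual = [[A2[r][c] - trace * A[r][c] + det * (1 if r == c else 0)
             for c in range(2)] for r in range(2)]
print(residual)   # [[0, 0], [0, 0]] -- A satisfies its own characteristic polynomial
```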

Answer:

Well, a simple answer that I'm not sure anyone's given yet: just think of each row of a matrix as the coefficients of the unknowns in a set of simultaneous equations.

A square matrix has the same number of equations as unknowns, so it represents the left-hand side of a system that has a unique solution whenever the rows are linearly independent (i.e., no row is a combination of the others).

With matrices it's often helpful to think of them geometrically but also often to think about them as equation coefficients.
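A tiny worked instance of the "equations = unknowns" view (Python, with a made-up system; not part of the original answer), solved by Cramer's rule, which works precisely when the $2\times 2$ coefficient determinant is nonzero:

```python
# System:  2x + 1y = 5
#          1x + 3y = 10
a, b, e = 2, 1, 5
c, d, f = 1, 3, 10

det = a * d - b * c          # 5: nonzero, so a unique solution exists
x = (e * d - b * f) / det    # Cramer's rule: replace first column with (e, f)
y = (a * f - e * c) / det    # replace second column with (e, f)
print(x, y)                  # 1.0 3.0
```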

Answer:

For a matrix $\mathbf{A}$ to be invertible it has to be an $n\times n$ square matrix with there existing an $n\times n$ square matrix $\mathbf{B}$ such that, by the operations of matrix multiplication, we have

$$\mathbf {AB} =\mathbf {BA} =\mathbf {I}_{n}\tag{1}$$ where $\mathbf {I}_{n}$ denotes the $n\times n$ identity matrix. This matrix $\mathbf{B}$ is then unique and is termed the inverse of $\mathbf{A}$, denoted by $\mathbf{A}^{-1}$. We see from (1) that the matrices must be square: if $\mathbf{A}$ were $m\times n$ with $m\neq n$ and $\mathbf{B}$ were $n\times m$, then $\mathbf{AB}$ and $\mathbf{BA}$ would have different sizes and could not both equal the same identity matrix. A square matrix has no inverse if and only if its determinant is $0$, in which case it is termed singular.
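A quick numerical check of (1) with a made-up pair (Python, added for illustration): for $\mathbf{A} = \begin{pmatrix}2&1\\1&1\end{pmatrix}$, which has determinant $1$, the candidate $\mathbf{B} = \begin{pmatrix}1&-1\\-1&2\end{pmatrix}$ satisfies both products:

```python
def matmul(m, n):
    return [[sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

A = [[2, 1], [1, 1]]
B = [[1, -1], [-1, 2]]
I2 = [[1, 0], [0, 1]]

# Both orders of multiplication give the identity, so B = A^(-1):
print(matmul(A, B) == I2, matmul(B, A) == I2)   # True True
```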

An important outcome of having invertible square matrices is that a group structure may be imposed on them. The set of all invertible $n\times n$ matrices over some field $F$ forms a group under matrix multiplication, termed the general linear group of degree $n$, or $\text{GL}(n,F)$. It can be seen that the product of two invertible matrices is invertible, and that for an invertible matrix $(\mathbf{A}^{-1})^{-1}=\mathbf{A}$.

The special linear group, $\text{SL}(n,F)$, is the subset of $\text{GL}(n,F)$ whose matrices have determinant $1$, and is a normal subgroup of it.

Consider $\text{SL}(n,\mathbb{R})$: because its matrices have determinant $1$, the corresponding linear maps $\mathbb{R}^n\rightarrow\mathbb{R}^n$ preserve both volume and orientation.
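For instance (a Python sketch with a made-up shear, not part of the original answer): a shear matrix lies in $\text{SL}(2,\mathbb{R})$, and the unit square it transforms keeps signed area $1$:

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

shear = [[1, 3],
         [0, 1]]           # det = 1, so shear is in SL(2, R)

u, v = [1, 0], [0, 1]      # unit square, signed area 1
u2, v2 = apply(shear, u), apply(shear, v)
area = u2[0] * v2[1] - u2[1] * v2[0]   # signed area of the sheared square
print(det2(shear), area)   # 1 1 -- area and orientation both preserved
```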