Degrees of freedom in an equation


In a linear system of equations $$\left. \begin{matrix} F(x,y,z) = 0 \\ G(x,y,z) = 0 \\ H(x,y,z) = 0 \end{matrix} \right.$$ we often say that each equation reduces the degrees of freedom by 1, or that the dimension of the solution set decreases by 1. From this, we get the rule of thumb that an equation typically reduces the degrees of freedom of a system by 1, or sometimes by 0 if it is linearly dependent on another equation.

However, we can combine these 3 equations into a single equation in a few ways. (Note that, over the reals, equation (2) has exactly the same solutions as the system, while equation (1) holds whenever any one of the three equations does.)

$$F(x,y,z) \times G(x,y,z) \times H(x,y,z) = 0 \tag{1}$$

$${F(x,y,z)}^2 + {G(x,y,z)}^2 + {H(x,y,z)}^2 = 0 \tag{2}$$

Hence a single equation such as (2) must reduce the degrees of freedom by 3, not just 1. My question is: is there any way to calculate how many degrees of freedom an equation will use up? And is there a name for the study of this topic?

Also, is there any use for this trick? I'm aware that typically we branch equations. For example, $x^2+5x+6 = 0$ can be factorised as $(x+2)(x+3)=0$ and then branched into two equations.

$$ (x+2)(x+3)=0 \qquad \rightarrow \qquad \left\{ \begin{matrix} x+2 = 0 \\ x+3=0 \end{matrix} \right. $$ Is it ever useful to go the other way? What about in non-linear systems?
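For instance, the factorisation and branching above can be checked mechanically (a small sketch using sympy, assuming it is available):

```python
import sympy as sp

x = sp.symbols('x')
quadratic = x**2 + 5*x + 6

# Factorising exposes the two branches of the original equation.
factored = sp.factor(quadratic)
print(factored)                      # (x + 2)*(x + 3)

# Each branch contributes one root to the combined equation.
roots = sorted(sp.solve(quadratic, x))
print(roots)                         # [-3, -2]
```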

Thank you in advance.


A Very Technical Answer

Consider the function $F : \mathbb{R}^n \to \mathbb{R}^m$. Then we can think of the equation $F(x_1, \ldots, x_n) = 0$ as a fully general system of $m$ equations with $n$ unknowns.

Suppose moreover that $F$ is continuously differentiable and has constant rank near a point $a \in \mathbb{R}^n$ such that $F(a) = 0$. (For most problems you will encounter in the wild, $F$ will be continuously differentiable. For reasons beyond the scope of this answer, the constant rank assumption is also not much of an imposition.) The constant rank theorem says that there are open neighborhoods $U$ of $a$ and $V$ of $F(a) = 0$ and diffeomorphisms $u : \mathbb{R}^n \to U$ and $v : \mathbb{R}^m \to V$ such that $F(U) \subset V$ and such that $dF_a = v^{-1} \circ F \circ u$.

Let $A = \{x \in U : F(x) = 0\}$. The question "how many degrees of freedom are there near $a$?" is the same as the question "what is the dimension of $A$?" Since $u$ is a diffeomorphism, we may equivalently ask "what is the dimension of $u^{-1}(A) = \{y \in \mathbb{R}^n : u(y) \in A\} = \{y \in \mathbb{R}^n : F(u(y)) = 0\}$?" Moreover, since $v$ is a diffeomorphism, we have \begin{align} u^{-1}(A) &= \{y \in \mathbb{R}^n : F(u(y)) = 0\} \\&= \{y \in \mathbb{R}^n : v^{-1}(F(u(y))) = v^{-1}(0) \} \\&= \{y \in \mathbb{R}^n : (v^{-1} \circ F \circ u)(y) = v^{-1}(0) \} \\&= \{y \in \mathbb{R}^n : dF_a (y) = v^{-1}(0) \} \\&= dF_a^{-1}(v^{-1}(0)) \end{align} where in the last line $dF_a^{-1}$ denotes the pre-image, not the inverse (which may not exist!). Let $p = v^{-1}(0)$. We have shown that determining the number of degrees of freedom of a system of $m$ equations in $n$ unknowns given by $F$ near $a$ is equivalent to determining the dimension of $dF_a^{-1}(p)$.

But determining the dimension of $dF_a^{-1}(p)$ is a standard linear algebra problem. If there exists $q$ such that $dF_a(q) = p$, then $dF_a^{-1}(p) = q + \ker dF_a$. Thus, the dimension of $dF_a^{-1}(p)$ is precisely $\dim \ker dF_a$. By rank-nullity, $\dim \ker dF_a = n - \text{rank } dF_a$.

Putting everything together, we may conclude that a system of $m$ equations in $n$ unknowns given by $F$ (near $a$, with the assumptions above) has $n - \text{rank } dF_a$ degrees of freedom. The case where we have $0$ degrees of freedom corresponds to the solution set being discrete.
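As a sanity check, the count $n - \text{rank } dF_a$ can be computed symbolically. Here is a sketch with a made-up system (two equations, the second a multiple of the first, so only one independent constraint — this example is not from the post):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
# Hypothetical system: the second equation is twice the first,
# so there is only one independent constraint.
F = sp.Matrix([x + y + z, 2*x + 2*y + 2*z])
J = F.jacobian([x, y, z])

a = {x: 1, y: -1, z: 0}          # a point with F(a) = 0
rank = J.subs(a).rank()
dof = 3 - rank                   # n - rank dF_a
print(dof)                       # 2 degrees of freedom (a plane of solutions)
```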

As a final note, observe that at the start of this discussion we assumed that we could find a point $a$ such that $F(a) = 0$. If we can't, then of course there are no solutions and thus no degrees of freedom. As far as I know, there is no general way to check whether or not a system of equations admits solutions.

A Less Technical Answer

An equation can "use up" at most $1$ degree of freedom. Roughly, this is because an equation could let us write at most $1$ variable in terms of the other variables, reducing the number of variables left "free" by $1$.

But an equation may not use up any degrees of freedom. For example, suppose I have the equation $x + y = 0$. This uses up $1$ degree of freedom. If I further impose the equation $2x + 2y = 0$, I use up no additional degrees of freedom. The equation $2x + 2y = 0$ says the same thing as the equation $x + y = 0$. That's because the vectors formed by the coefficients of the first equation $(1,1)^T$ and the coefficients of the second equation $(2,2)^T$ are linearly dependent -- they point in the same direction. What happens when the equations are nonlinear?
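The linear example above can be phrased directly in rank terms: stack the coefficient vectors into a matrix, and the rank counts the independent constraints.

```python
import sympy as sp

# Coefficient rows of x + y = 0 and 2x + 2y = 0 (from the text above).
A = sp.Matrix([[1, 1],
               [2, 2]])

used_up = A.rank()               # number of independent constraints
print(used_up)                   # 1: the second row repeats the first
print(2 - used_up)               # 1 degree of freedom remains
```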

An equation (in the sense we mean it here) is just a statement of the form $f(x_1, \ldots, x_n) = 0$. From multivariable calculus, we know the rate of change of $f$ in the direction $v \in \mathbb{R}^n$ is given by $\nabla f \cdot v$. If we move orthogonal to $\nabla f$, $f$ remains constant, so if we take a tiny step away from a point $(x_1, \ldots, x_n)$ satisfying $f(x_1, \ldots, x_n) = 0$ in a direction orthogonal to $\nabla f$, we'll still be at a point that satisfies the equation. The number of directions we can step in while still satisfying the equation is exactly what we mean by "degrees of freedom." We see that as long as $\nabla f \neq 0$, one equation by itself reduces the number of degrees of freedom by 1: we are no longer allowed to move in the $\nabla f$ direction.

If we add another equation $g(x_1, \ldots, x_n) = 0$, we will only lose a degree of freedom if $\nabla g$ points in a direction different from $\nabla f$. If it points in the same direction, we'll have the same number of degrees of freedom, since the $\nabla f$ direction was already forbidden.

If we have $m$ equations $f_1(x_1, \ldots, x_n) = 0, \ldots, f_m(x_1, \ldots, x_n) = 0$, then the number of forbidden directions will be the number of independent directions the $\nabla f_i$ point in (note: by independent, I mean "linearly independent." If this is an unfamiliar concept, think of the following example: north and west are independent, but north, west, and northwest are not independent, since I can already move northwest by moving north and then west). Thus, the number of degrees of freedom will be the number of directions I could move before there were any equations, $n$, minus the number of directions the equations forbid. The number of directions the equations forbid is nothing but the rank of $dF$, where $F : \mathbb{R}^n \to \mathbb{R}^m$ is given by $F(x_1, \ldots, x_n) = (f_1(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n))$ and $dF$ is the matrix whose rows are the $\nabla f_i$ (I should say $df_i$ throughout here, but I'm using more familiar, albeit less accurate, notation for pedagogical reasons).
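Here is a small symbolic check of this recipe on a made-up nonlinear system (the unit sphere cut by a plane, not an example from the post): the rows of $dF$ are the gradients, and $n - \text{rank } dF$ counts the surviving directions.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f1 = x**2 + y**2 + z**2 - 1      # unit sphere
f2 = z                           # a plane through the origin
F = sp.Matrix([f1, f2])
J = F.jacobian([x, y, z])        # rows are grad f1 and grad f2

a = {x: 1, y: 0, z: 0}           # a point satisfying both equations
rank = J.subs(a).rank()
print(3 - rank)                  # 1 degree of freedom: the unit circle
```

At $(1,0,0)$ the gradients $(2,0,0)$ and $(0,0,1)$ are independent, so both equations genuinely cost a degree of freedom and a 1-dimensional solution set (the circle) remains.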

Answering Other Questions in Your Post

As far as I know there isn't a name for this topic, but these kinds of questions are included in the study of manifolds.

The branching of equations you're talking about at the end of your post is a somewhat different mathematical idea. The equation $x^2 + 5x + 6 = 0$ implies $x + 2 = 0$ OR $x + 3 = 0$, which has solutions $-2$ and $-3$. Usually when we talk about systems of equations, the equations have an implicit AND between them: $x + 2 = 0$ AND $x + 3 = 0$ has no solution.

I had never thought about it, but it is useful to "go the other way." If I have a bunch of equations which are connected by ORs, then I can package them as one equation by multiplying them together. For example, if I want to say $f_1(x,y) = 0$ or $f_2(x,y) = 0$, then it suffices to say $f_1(x,y) f_2(x,y) = 0$. If I wanted to further require that $f_3(x,y) = 0$, then I would form the system of equations $f_1(x,y) f_2(x,y) = 0$ and $f_3(x,y) = 0$.
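A quick numeric check of the OR/AND distinction, using the post's own factors $x+2$ and $x+3$ (plain Python, no extra assumptions):

```python
def f1(x): return x + 2
def f2(x): return x + 3

# The packaged equation f1(x) * f2(x) = 0 encodes the OR:
# it is satisfied by either root.
for root in (-2, -3):
    assert f1(root) * f2(root) == 0

# The system f1 = 0 AND f2 = 0 has no solution:
assert not any(f1(x) == 0 and f2(x) == 0 for x in (-2, -3))

# The sum-of-squares packaging encodes the AND instead:
# f1(x)**2 + f2(x)**2 is never 0 at either root (nor anywhere else).
assert all(f1(x)**2 + f2(x)**2 > 0 for x in (-2, -3))
```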