If we have Sturm's sequence of polynomials, $p_0=p, p_1, \ldots, p_m$, for a given polynomial $p$, then the number of real roots of $p$ in a half-open interval $(a,b]$ is $W(a)-W(b)$, where $W$ is the function that takes a real number $x$ and returns the number of sign changes in Sturm's sequence evaluated at $x$.
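For concreteness, take $p(x) = x^2 - 1$. Its Sturm sequence is $p_0 = x^2 - 1$, $p_1 = 2x$, $p_2 = 1$ (the negated remainder of $p_0$ divided by $p_1$, since $x^2 - 1 = \frac{x}{2}\cdot 2x - 1$). At $x = -2$ the values are $(3, -4, 1)$, so $W(-2) = 2$; at $x = 2$ they are $(3, 4, 1)$, so $W(2) = 0$; and indeed $p$ has $W(-2) - W(2) = 2$ roots in $(-2, 2]$.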
In order to prove the theorem, we watch what happens to $W$ as $x$ moves from left to right along the $x$-axis. We pick some interval $(a-\epsilon, a+\epsilon)$ on which none of the polynomials in Sturm's sequence is zero, except possibly at $a$. We separate the cases where $a$ is a zero of some $p_i$ with $i>0$, and where $p_0(a)=0$.
In the first case, we prove that the number of sign changes in the triple $(p_{i-1}, p_i, p_{i+1})$ stays the same as $x$ moves across $a$. What I don't understand is how this proves that the number of sign changes stays the same not just when we count it for the triple $(p_{i-1}, p_i, p_{i+1})$, but for the whole sequence. What if there is some $p_j$, $j \neq i$, that is also zero at $a$? We can apply the result proven for the triple surrounding $p_j$, but it's not obvious to me that if $W$ is unchanged when evaluated over such triples, then $W$ is unchanged when evaluated over the whole sequence.
Also, in the second case, we prove that the number of sign changes drops by $1$ as we cross the zero of $p_0$, but we only prove it for the pair $(p_0, p_1)$. How do we move from there to counting the number of sign changes in the whole sequence?
I've searched online and all the proofs are the same: they skip the part I'm talking about (or maybe it's obvious and I don't see it).
Let $\sigma(u,v) = 1$ if $u < 0 < v$ or $v < 0 < u$, and $\sigma(u,v) = 0$ otherwise. Then for a sequence $\{y_i\}_{i=0}^N$ of non-zero values, $$\sum_{i=1}^N \sigma(y_{i-1}, y_i)$$ gives the number of sign variations in the sequence $\{y_i\}$. Unfortunately, this doesn't work when some of the $y_i$ are zero, but I'll work around that.
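A minimal Python sketch of this count (the helper names `sigma` and `variations` are my own, not standard):

```python
def sigma(u, v):
    """1 if u and v have strictly opposite signs, 0 otherwise."""
    return 1 if (u < 0 < v) or (v < 0 < u) else 0

def variations(ys):
    """Number of sign variations in a sequence of non-zero values.

    Sums sigma over consecutive pairs, i.e. sum_{i=1}^{N} sigma(y_{i-1}, y_i).
    """
    return sum(sigma(a, b) for a, b in zip(ys, ys[1:]))

# The sequence (3, -1, -2, 5) changes sign twice: + -> - and - -> +.
print(variations([3, -1, -2, 5]))  # 2
```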
In particular, if $P = \{p_i\}_{i=0}^N$ is any sequence of non-zero polynomials, then $$W_P(x) = \sum_{i=1}^N \sigma(p_{i-1}(x), p_i(x))$$ except where one or more $p_i(x) = 0$, and that happens only at a finite number of isolated points. If $p_{i-1}$ and $p_i$ are never $0$ on some interval, then neither changes sign there, so $\sigma(p_{i-1}, p_i)$ is constant on the interval. If $p_i(c) = 0$ for some $c$, there is an open interval about $c$ on which none of $p_{i-1}, p_i, p_{i+1}$ is zero, except $p_i$ at $c$ itself. So to the left of $c$, and to the right of $c$, the number of sign variations in this triple is $$T_i(x) := \sigma(p_{i-1}(x), p_i(x)) + \sigma(p_i(x), p_{i+1}(x))$$ At $c$ itself, the number of sign variations is $$T_i(c) := \sigma(p_{i-1}(c), p_{i+1}(c))$$ since $p_i(c) = 0$. The basic result is
Lemma: if $P = \{p_i\}_{i=0}^N$ is a sequence of non-zero polynomials satisfying the conditions (1) no two adjacent polynomials $p_{i-1}, p_i$ have a common zero, and (2) whenever $p_i(c) = 0$ for some $0 < i < N$, the triple count $T_i$ is constant on a neighborhood of $c$,

Then $W_P(x)$ is constant on all intervals that do not include a zero of $p_0$ or $p_N$.
Proof: As noted above, the $\sigma(p_{i-1}, p_i)$ are all constant on intervals that do not contain zeros of any of the polynomials, so their sum $W_P(x)$ is also constant on those intervals. The only places where it can change value are the points $c$ where at least one of the polynomials is $0$. Since there are only finitely many such roots, they are isolated from each other.
Let $(a,b)$ be an interval not including any zeros of the two end polynomials and containing only one zero $c$ of the remaining polynomials. Near $c$ we can divide the indices into two sets $A = \{i\mid p_{i-1}(c) = 0\text{ or } p_i(c) = 0\}, B = \{1, \ldots, N\} \setminus A$. Then $$W_P(x) = \sum_{i\in A} \sigma(p_{i-1}(x), p_i(x)) + \sum_{i\in B} \sigma(p_{i-1}(x), p_i(x))$$ Since neither polynomial in the sum over $B$ is $0$ near $c$, every term, and therefore the sum, is constant. Since no polynomials that are $0$ at $c$ are adjacent, we can rewrite the sum over $A$ as
$$\sum_{i\in A} \sigma(p_{i-1}(x), p_i(x)) = \sum_{p_i(c) = 0} \left[\sigma(p_{i-1}(x), p_i(x)) + \sigma(p_i(x), p_{i+1}(x))\right] = \sum_{p_i(c) = 0} T_i(x)$$ But by hypothesis, $T_i(x)$ is constant near $c$ for each $i$ with $p_i(c) = 0$. So the sum over $A$, and therefore $W_P$, are both constant near $c$. Since $W_P$ is constant between any two zeros of the inner polynomials, and also in neighborhoods of those zeros, it must be constant over the entire interval $(a,b)$. QED
The only reason the argument doesn't work for zeros of $p_0$ and $p_N$ is that there is no polynomial on one side to form one of the triples.
Now, given a square-free polynomial $p_0$, the Sturm sequence satisfies the recursion $$p_{i-1} + p_{i+1} = q_ip_i$$ for some polynomials $q_i$. If $p_i(c) = 0$ and either of $p_{i-1}(c)$ or $p_{i+1}(c)$ is also zero, then the third polynomial is zero as well. Thus if the Sturm sequence has two adjacent polynomials that are $0$ at $c$, then every polynomial in the sequence must also be equal to $0$ at $c$. This includes $p_0$ itself and $p_1 = p_0'$. But a square-free polynomial and its derivative cannot share a zero. So this cannot occur. Therefore no two adjacent polynomials in the Sturm sequence share a common zero.
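For concreteness, take $p_0 = x^3 - x$. Then $p_1 = 3x^2 - 1$, and polynomial division gives $$x^3 - x = \frac{x}{3}(3x^2 - 1) - \frac{2x}{3}, \qquad 3x^2 - 1 = \frac{9x}{2}\cdot\frac{2x}{3} - 1,$$ so the Sturm sequence is $\left(x^3 - x,\; 3x^2 - 1,\; \tfrac{2}{3}x,\; 1\right)$. The recursion reads $p_0 + p_2 = \tfrac{x}{3}\,p_1$ and $p_1 + p_3 = \tfrac{9x}{2}\,p_2$, and one can check that no two adjacent members share a zero.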
Now if $p_i(c) = 0$, we must have $p_{i+1}(c) \ne 0, p_{i-1}(c) \ne 0$, but $p_{i-1}(c) + p_{i+1}(c) = q_i(c)p_i(c) = 0$. Therefore $p_{i+1}(c) = -p_{i-1}(c) \ne 0$, so there must be some neighborhood of $c$ in which $p_{i+1}$ and $p_{i-1}$ are of opposite signs. In this neighborhood, except at $c$ itself, $p_i$ must agree in sign with one or the other. Thus there is exactly one sign variation among the three polynomials anywhere in the neighborhood. That is, $T_i$ is constant on the neighborhood.
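To see this concretely in the Sturm sequence $\left(x^3 - x,\; 3x^2 - 1,\; \tfrac{2}{3}x,\; 1\right)$ of $p_0 = x^3 - x$: the inner polynomial $p_2 = \tfrac{2}{3}x$ vanishes at $c = 0$, where $p_1(0) = -1$ and $p_3(0) = 1$ have opposite signs. Near $0$ the signs of $(p_1, p_2, p_3)$ are $(-, -, +)$ for $x < 0$ and $(-, +, +)$ for $x > 0$, one variation either way, and at $0$ itself $T_2(0) = \sigma(-1, 1) = 1$. So $T_2 \equiv 1$ near $0$.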
So Sturm sequences satisfy both conditions of the lemma, and $W_P$ is constant between zeros of $p_0$ and $p_N$. The final piece of the puzzle is this: for the Sturm sequence of a square-free polynomial, $p_N$ is never $0$. Since $p_N$ is the last polynomial in the sequence, the next remainder must be $0$, which means $p_{N-1} = q_N p_N$. So if $p_N(c) = 0$, then also $p_{N-1}(c) = q_N(c)p_N(c) = 0$. And as indicated above, this means that all polynomials in the sequence are $0$ at $c$, contradicting the fact that $p_0$ is square-free.
Since $p_N$ has no zeros, $W_P$ can only change values at the zeros of $p_0$.
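As a numerical sanity check, here is a self-contained Python sketch of the whole construction. The coefficient-list helpers, the tolerance `EPS`, and the convention of dropping zero values when counting variations are my own choices, not part of the proof; it counts the roots of the square-free polynomial $x^3 - x$ in $(-2, 2]$:

```python
EPS = 1e-9  # tolerance standing in for exact rational arithmetic

def deriv(p):
    """Derivative of p, given as a coefficient list, highest degree first."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def poly_mod(a, b):
    """Remainder of polynomial a divided by polynomial b."""
    a = list(a)
    while len(a) >= len(b):
        f = a[0] / b[0]
        for i in range(1, len(b)):
            a[i] -= f * b[i]
        a = a[1:]  # the leading term cancels exactly
    return a

def sturm(p):
    """Sturm sequence: p0 = p, p1 = p', p_{i+1} = -(p_{i-1} mod p_i)."""
    seq = [list(p), deriv(p)]
    while True:
        r = poly_mod(seq[-2], seq[-1])
        if all(abs(c) < EPS for c in r):
            return seq
        seq.append([-c for c in r])

def peval(p, x):
    """Evaluate p at x by Horner's rule."""
    v = 0.0
    for c in p:
        v = v * x + c
    return v

def W(seq, x):
    """Number of sign variations of the sequence at x (zero values dropped)."""
    vals = [peval(q, x) for q in seq]
    vals = [v for v in vals if abs(v) > EPS]
    return sum(1 for a, b in zip(vals, vals[1:]) if a * b < 0)

def count_roots(p, a, b):
    """Number of distinct real roots of square-free p in (a, b]."""
    seq = sturm(p)
    return W(seq, a) - W(seq, b)

# p(x) = x^3 - x has roots -1, 0, 1, all in (-2, 2].
print(count_roots([1, 0, -1, 0], -2, 2))  # 3
```

For $x^3 - x$ the computed sequence is $\left(x^3 - x,\; 3x^2 - 1,\; \tfrac{2}{3}x,\; 1\right)$, with $W(-2) = 3$ and $W(2) = 0$, so the count is $3$ as expected.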
The argument breaks down when $p_0$ is not square-free. I have not worked out how the proof must change in this case. Wikipedia indicates that the result in the non-square-free case is only a little more restrictive.