Minimizing the maximum absolute value of a monic quadratic polynomial


I'm considering the following problem $$ \min_{t_0,t_1\in I}\max_{t\in I} |\omega(t;t_0,t_1)|, $$ where $\omega(t;t_0,t_1) = (t-t_0)(t-t_1)$ and $I=[-1,1]$.

I know that the solution is $t_0=-t_1 = 1/\sqrt 2$, and I am looking for a proof using basic, easy-to-understand methods.

My notes say that since "the problem is symmetric" (I suppose $\omega(t;t_0,t_1) = \omega(t;t_1,t_0)$ is meant), it follows that $t_1 = -t_0$. But I don't get the argument and I don't want to use a complicated theorem from optimization.

For context: The problem arises when trying to minimize the remainder term of polynomial interpolation with two interpolation nodes. It is known that, in order to minimize it, one has to use the zeros of the Chebyshev polynomials (linearly transformed to the interval in question).
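As a quick numerical sanity check of that claim (a sketch; the grid resolution and the comparison nodes are arbitrary choices, not part of the question):

```python
import numpy as np

def max_abs_omega(t0, t1, n=100001):
    """Maximum of |(t - t0)(t - t1)| over a fine grid on [-1, 1]."""
    t = np.linspace(-1.0, 1.0, n)
    return np.max(np.abs((t - t0) * (t - t1)))

# Two-point Chebyshev nodes: t = ±1/sqrt(2); the claimed optimal value is 1/2.
c = 1 / np.sqrt(2)
print(max_abs_omega(-c, c))      # ≈ 0.5
print(max_abs_omega(-0.5, 0.5))  # larger, so worse
print(max_abs_omega(-0.9, 0.9))  # larger still
```

Any symmetric pair other than $\pm 1/\sqrt 2$ gives a strictly larger maximum on the grid, consistent with the stated solution.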


BEST ANSWER

The following explanation will be a bit "informal"; it is meant to provide the intuition, rather than a rigorous proof. I will also use $\omega(t;t_1,t_2)$ to denote $|(t-t_1)(t-t_2)|$, rather than the original expression without the absolute value.

First of all, the "intuitive" explanation is this: if $t_1$ and $t_2$ are chosen and we add the same value to both, the graph of our function does not change in shape or height; it just moves left or right. The graph is symmetric about the axis $x=(t_1+t_2)/2$, and moving it so that this axis lies exactly in the middle of the interval $[-1,1]$ is as good as it gets when it comes to minimizing the maximum over this interval. This is precisely due to the symmetry: if the axis were not in the middle, shifting it towards the middle would, at worst, keep the maximum the same, but it could also lower it.

Now, let's try some slightly more involved reasoning (still not rigorous, though)!

Start by considering an arbitrary function $f(x)$ satisfying two properties:

  • It is reasonably well-behaved; in particular, it has a well-defined maximum over any closed interval.
  • It is symmetric about the axis $x=s$. Formally: $f(s+\Delta)=f(s-\Delta)$ for any $\Delta>0$.

These two properties allow us to draw some conclusions about $f$'s maximum within certain intervals. Let $[s-a,s+b]$ be any closed interval of length $L=(a+b)$ which contains $s$. We then have the following chain of (in-)equalities $$\begin{eqnarray} \max_{x\in[s-a,s+b]} f(x) & = & \max\left(\max_{x\in[s-a,s]} f(x), \max_{x\in[s,s+b]} f(x)\right)\\ & = & \max_{x\in[s,s+\max(a,b)]} f(x) \\ & \geq & \max_{x\in[s,s+L/2]} f(x) \\ & = & \max_{x\in[s-L/2,s+L/2]} f(x) \end{eqnarray}$$

The second equality is based on symmetry of $f(x)$ about $x=s$; the longer of the intervals $[s-a,s]$ and $[s,s+b]$ contains a mirror image of the shorter one, so the maximum of $f(x)$ over it must be equal or greater than the maximum over the shorter one. The inequality follows from observing that $(a+b)=L$ implies $\max(a,b)\geq L/2$. Finally, the last equality is once again based on the symmetry.
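The chain of (in-)equalities can be checked numerically; here is a small sketch using $\omega$ itself as the symmetric test function, with the arbitrarily chosen roots $\pm 0.7$ (so $s=0$):

```python
import numpy as np

def max_on(f, lo, hi, n=200001):
    """Maximum of f over a fine grid on [lo, hi]."""
    x = np.linspace(lo, hi, n)
    return np.max(f(x))

# A function symmetric about s = 0: omega with t1 = -t2 = 0.7.
f = lambda x: np.abs((x - 0.7) * (x + 0.7))

L = 2.0
sym = max_on(f, -L / 2, L / 2)  # interval centred on the axis of symmetry

# Any other length-L interval [s - a, s + b] containing s does no better:
for a in [0.2, 0.5, 1.0, 1.4, 1.8]:
    assert max_on(f, -a, L - a) >= sym - 1e-12
```

Every shifted interval of the same length yields a maximum at least as large as the centred one, as the inequality chain predicts.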

Now, what does this actually tell us about $f(x)$, in simple terms? If we are trying to find the maximum of $f(x)$ over an interval of length $L$ which contains $s$, this maximum would certainly not be smaller than the maximum over the interval in which $s$ lies exactly in the middle. Thus, if we are trying to minimize the maximum by moving the interval (but keeping $s$ inside it!), we won't be able to get any better than the interval symmetric about $s$. Note that this does not mean that this is the only interval which achieves the minimum; it only implies the value of the minimum is as small as it gets.

Our function $\omega$ actually has those two properties! It is certainly well-behaved, and it also has an axis of symmetry, defined by $s=(t_1+t_2)/2$. Thus, if we look at all intervals of length $2$ which contain $s$ in search of the one which minimizes the maximum of $\omega$, it is not possible to do better than the interval $[s-1,s+1]$.

But... this seems like the opposite of what we are allowed to do! We are free to move $t_1$ and $t_2$, but the interval is fixed: $[-1,1]$. This is indeed the case; and it is also precisely the reason why the symmetry is insufficient on its own (as LinAlg demonstrated in the comment to the question).

Fortunately, our function $\omega$ has another important property: We can shift it around by adjusting $t_1$ and $t_2$ appropriately. Formally, we have (for any choice of the variables) $$\omega(t+\Delta;t_1,t_2)=\omega(t;t_1-\Delta,t_2-\Delta)$$
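A quick numerical check of this translation identity (a sketch; the particular values of $t_1$, $t_2$ and $\Delta$ are arbitrary, and `omega` here includes the absolute value, as in the rest of this answer):

```python
import numpy as np

def omega(t, t1, t2):
    """|(t - t1)(t - t2)|, as used throughout this answer."""
    return np.abs((t - t1) * (t - t2))

t = np.linspace(-1.0, 1.0, 11)
t1, t2, d = 0.3, -0.8, 0.25

# Shifting the argument is the same as shifting both roots the other way.
assert np.allclose(omega(t + d, t1, t2), omega(t, t1 - d, t2 - d))
```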

Armed with this property, we can move the interval around and conclude that the unbeatable minimum is actually reachable, by shifting the function so that its axis of symmetry becomes $x=0$. This would actually be optimal even for $t_1$, $t_2$ outside of the interval $[-1,1]$, as long as $(t_1+t_2)/2$ falls into the $[-1,1]$ interval.

As a final remark: The "translation" property we used can be described in a more compact form as: $$\omega(t;t_1,t_2)=h(t_1-t,t_2-t)$$ for some function $h$. In other words, $\omega$ does not depend on $t_1$ or $t_2$ directly, but always in combination with $(-t)$. This precludes functions like the one suggested by LinAlg.

ANSWER

Consider the inner maximization problem $\max_{t\in I} |(t-t_0)(t-t_1)|$, where $$ |(t-t_0)(t-t_1)|=\begin{cases}(t-t_0)(t-t_1) & \text{if } t<\min\{t_0,t_1\} \lor t>\max\{t_0,t_1\}\\-(t-t_0)(t-t_1) &\text{otherwise.}\end{cases} $$ This function has three potential maxima: at $t=-1$, at $t=1$ and at $t=(t_0+t_1)/2$. The function values at these potential maxima are $(1+t_0)(1+t_1)$, $(1-t_0)(1-t_1)$ and $0.25(t_1 - t_0)^2$.
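For nodes inside $[-1,1]$, the maximum over a fine grid indeed coincides with the largest of these three candidate values; a sketch with an arbitrarily chosen pair of nodes:

```python
import numpy as np

def candidates(t0, t1):
    """Values of |(t - t0)(t - t1)| at t = -1, t = 1 and t = (t0 + t1)/2."""
    return ((1 + t0) * (1 + t1),
            (1 - t0) * (1 - t1),
            0.25 * (t1 - t0) ** 2)

t0, t1 = -0.4, 0.6  # arbitrary nodes in [-1, 1]
t = np.linspace(-1.0, 1.0, 200001)
grid_max = np.max(np.abs((t - t0) * (t - t1)))
assert abs(max(candidates(t0, t1)) - grid_max) < 1e-9
```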

The outer problem is therefore: $$\begin{align}&\min_{t_0,t_1\in I}\max\{(1+t_0)(1+t_1), (1-t_0)(1-t_1), 0.25(t_1 - t_0)^2\}\\ =&\min_{t_0 \in I} \min_{t_1\in I}\max\{(1+t_0)(1+t_1), (1-t_0)(1-t_1), 0.25(t_1 - t_0)^2\}.\end{align} $$ Now consider the inner minimization problem (for a fixed value of $t_0$). The potential minima are $t_1=-1$ and $t_1=1$ (endpoints), $t_1=t_0$ (stationary points) and $t_1=-t_0$, $t_1 =3t_0 + 2 \pm 2\sqrt{2}(t_0+1)$ and $t_1 = 3t_0 - 2 \pm 2\sqrt{2}(t_0-1)$ (potential breakpoints). At this point it helps to plot the three functions within the max operator for different values of $t_0$, with $t_1$ on the horizontal axis and the first function in blue, the second in red and the third in orange. [plot omitted]

The minimum occurs at the breakpoint between the first two functions ($t_1=-t_0$) when $t_0 \in [-1/\sqrt{2}, 1/\sqrt{2}]$, at the breakpoint between the second and third functions ($t_1 = 3t_0 - 2 - 2\sqrt{2}(t_0-1)$) for $t_0<-1/\sqrt{2}$, and at the breakpoint between the first and third functions ($t_1 = 3t_0 + 2 - 2\sqrt{2}(t_0+1)$) for $t_0>1/\sqrt{2}$. You can now plot the value of the minimization problem as a function of $t_0$ [plot omitted]; it is then clear that the optimum occurs at the point you found.
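Since the plots are not reproduced here, the same conclusion can be reached by brute force over a grid (a sketch; the grid resolutions are arbitrary):

```python
import numpy as np

def inner_min(t0, n=2001):
    """min over t1 in [-1, 1] of the max of the three candidate values."""
    t1 = np.linspace(-1.0, 1.0, n)
    vals = np.maximum.reduce([(1 + t0) * (1 + t1),
                              (1 - t0) * (1 - t1),
                              0.25 * (t1 - t0) ** 2])
    return vals.min()

t0s = np.linspace(-1.0, 1.0, 2001)
obj = np.array([inner_min(t0) for t0 in t0s])
best_t0 = t0s[np.argmin(obj)]

# Expected (up to grid resolution): |t0| = 1/sqrt(2), objective value 1/2.
print(abs(best_t0), obj.min())
```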