A curious question about optimizing a function of 2 variables.


Let $f(x,y)$ be defined and have continuous first and second partial derivatives on a domain $D$. Also, let

$$A = \frac{\partial^2 f}{\partial x^2}, \qquad B = \frac{\partial^2 f}{\partial x \, \partial y}, \qquad C = \frac{\partial^2 f}{\partial y^2}$$

Let's say that, at a critical point of $f(x,y)$, $$A = -1, \qquad B = 0, \qquad C = -1.$$ Then the second directional derivative in the direction of a unit vector $u$ can be written as a quadratic form,

$$\nabla_u\nabla_u f = u^T Q u, \qquad Q = \begin{bmatrix}-1 & 0 \\ 0 & -1\end{bmatrix}$$ Here $B^2 - AC = 0 - 1 < 0$ and $A + C = -1 - 1 = -2 < 0$, so we know that $f(x,y)$ must have a relative maximum at that critical point. But can someone tell me how to reach this conclusion from the eigenvalues of the quadratic form?

Normally, if the quadratic form has two distinct eigenvalues of the same sign (say positive), I can say that the second directional derivative has a positive minimum and is therefore positive for all $u$. In this case, however, the eigenvalue is repeated, so the second directional derivative has only one extreme value (and I don't know whether it is a maximum or a minimum). So how can I conclude that $f(x,y)$ has a relative maximum from the repeated eigenvalue $-1$?


There are 2 best solutions below

On BEST ANSWER

If $(x_0,y_0)$ is your critical point, then from the given data we can conclude that $$f(x_0+X,y_0+Y)=f(x_0,y_0)-{1\over2}(X^2+Y^2)+o(|{\bf Z}|^2)\qquad\bigl({\bf Z}:=(X,Y)\to{\bf 0}\bigr)\ .$$ It follows that $$f(x_0+X,y_0+Y)-f(x_0,y_0)=-{1\over2}|{\bf Z}|^2\bigl(1+o(1)\bigr)<0$$ as soon as $|{\bf Z}|\neq 0$ is sufficiently small. Therefore we have a strict local maximum at $(x_0,y_0)$.
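To tie this back to the eigenvalue viewpoint in the question, here is a short worked sketch. It relies on the standard bound for a symmetric matrix (not stated in the answer above), and the component notation $u = (u_1,u_2)$ is introduced only for illustration. For a symmetric $Q$ with smallest and largest eigenvalues $\lambda_{\min} \le \lambda_{\max}$, $$\lambda_{\min} \;\le\; u^T Q u \;\le\; \lambda_{\max} \qquad \text{for every unit vector } u.$$ When the eigenvalue is repeated, $\lambda_{\min} = \lambda_{\max} = -1$, the two bounds coincide, and indeed $$u^T Q u = -(u_1^2 + u_2^2) = -1 \qquad \text{for every unit vector } u.$$ So the second directional derivative is the constant $-1$ in every direction: it is negative for all $u$, which is all the second-order test needs. Distinct eigenvalues are not required; only their signs matter.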

On

The results about optimization of $f(x,y)$ rest on the multivariate Taylor formula: $$ f(x,y) = f(a,b)+(\nabla f)(a,b) \cdot \langle x-a,y-b \rangle+ Q(a,b)(x-a,y-b) + \cdots $$ Here the quadratic term $Q(a,b)$ has the explicit form: $$ Q(a,b)(h,k) = \tfrac{1}{2}\bigl(f_{xx}(a,b)h^2+2f_{xy}(a,b)hk+ f_{yy}(a,b)k^2\bigr) $$ where you can think of $h=x-a$ and $k=y-b$.

A critical point is where the gradient either does not exist or vanishes. Some authors call the case where the gradient exists a stationary point. A stationary point $(a,b)$ has $(\nabla f)(a,b) = \langle 0,0 \rangle$. Furthermore, the multivariate Taylor expansion at a stationary point has the form: $$ f(x,y) = f(a,b)+ Q(a,b)(x-a,y-b) + \cdots $$

If $\|\langle x-a,y-b \rangle \| \ll 1$ (suppose we are close to the stationary point), then the terms of higher order than $Q$ are usually dominated by the values of $Q$. The only exception is when $Q$ admits an eigenvalue $0$. In that case, there is some direction $v$ in which $Q(a,b)(v)=0$, and along that direction the third-order and higher terms can lead to a min, max, saddle or trough (locally a cylinder).

Ok, getting back to the case in which $Q(a,b)$ has nonzero eigenvalues. If $\{ v,w \}$ is an orthonormal eigenbasis for $Q$ with eigenvalues $\lambda_1,\lambda_2$, and $y_1,y_2$ are coordinates with respect to that basis, then $Q(a,b)(y_1v+y_2w) = \lambda_1y_1^2+\lambda_2y_2^2$.

  1. If $\lambda_1,\lambda_2>0$ then $f\bigl((a,b)+y_1v+y_2w\bigr)\approx f(a,b)+\lambda_1y_1^2+\lambda_2y_2^2$ will be smallest at the stationary point, since any $(y_1,y_2) \neq (0,0)$ adds something positive to the value.
  2. If $\lambda_1,\lambda_2<0$ then $f\bigl((a,b)+y_1v+y_2w\bigr)\approx f(a,b)+\lambda_1y_1^2+\lambda_2y_2^2$ will be largest at the stationary point, since any $(y_1,y_2) \neq (0,0)$ adds something negative to the value.
  3. If $\lambda_1\lambda_2 <0$ then $f\bigl((a,b)+y_1v+y_2w\bigr)\approx f(a,b)+\lambda_1y_1^2+\lambda_2y_2^2$ will take values both larger and smaller than $f(a,b)$ near the stationary point, since one eigendirection adds positive quantities whereas the other adds negative quantities.
  4. If $\lambda_1\lambda_2 =0$ then other analysis is required.
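The four cases above can be sketched in code. This is a minimal illustration, not part of the original answer: the function names `hessian_eigenvalues` and `classify_stationary_point` and the tolerance `tol` are hypothetical, and the code works with the eigenvalues of the Hessian $\begin{bmatrix}A & B\\ B & C\end{bmatrix}$ rather than of $Q$ itself (the two differ by the factor $\tfrac12$, so the signs agree). It uses the closed-form eigenvalues of a symmetric $2\times 2$ matrix.

```python
import math


def hessian_eigenvalues(A, B, C):
    """Eigenvalues of the symmetric matrix [[A, B], [B, C]], smallest first."""
    mean = (A + C) / 2.0
    radius = math.hypot((A - C) / 2.0, B)
    return mean - radius, mean + radius


def classify_stationary_point(A, B, C, tol=1e-12):
    """Second-order classification from the signs of the eigenvalues.

    `tol` is an illustrative threshold for treating an eigenvalue as zero.
    """
    lo, hi = hessian_eigenvalues(A, B, C)
    if lo > tol:
        return "local minimum"   # case 1: both eigenvalues positive
    if hi < -tol:
        return "local maximum"   # case 2: both eigenvalues negative
    if lo < -tol and hi > tol:
        return "saddle point"    # case 3: opposite signs
    return "inconclusive"        # case 4: some eigenvalue is (near) zero


# The values from the question: A = -1, B = 0, C = -1 (repeated eigenvalue -1).
print(hessian_eigenvalues(-1, 0, -1))
print(classify_stationary_point(-1, 0, -1))
```

Note that the classification only looks at the smallest and largest eigenvalue, so a repeated eigenvalue is handled exactly like two distinct eigenvalues of the same sign.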

Finally, some simple examples of each case:

  1. $f(x,y) = x^2+(y-2)^2$ has a local minimum at the stationary point $(0,2)$.
  2. $f(x,y) = -x^2-(y-2)^2$ has a local maximum at the stationary point $(0,2)$.
  3. $f(x,y) = x^2-y^2$ has a saddle point at the stationary point $(0,0)$.
  4. $f(x,y) = \cos(x^2+y^2) = 1-\frac{1}{2}(x^2+y^2)^2+ \cdots$ fails to be captured by quadratic analysis at the stationary point $(0,0)$. However, the graph below shows that $f(0,0)=1$ is indeed a local maximum (which is also not hard to see if we just think about cosine a bit).

[Figure: graph of $z=\cos(x^2+y^2)$]
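As a rough numerical check of the fourth example, the sketch below estimates the second partials at $(0,0)$ by central differences and then samples the function on a small circle around the point. The helper names `hessian_fd` and `is_lower_on_circle`, the step size `h`, and the radius `r` are all illustrative choices, and sampling is evidence rather than a proof.

```python
import math


def hessian_fd(f, a, b, h=1e-3):
    """Central-difference estimates of (f_xx, f_xy, f_yy) at (a, b)."""
    fxx = (f(a + h, b) - 2.0 * f(a, b) + f(a - h, b)) / h**2
    fyy = (f(a, b + h) - 2.0 * f(a, b) + f(a, b - h)) / h**2
    fxy = (f(a + h, b + h) - f(a + h, b - h)
           - f(a - h, b + h) + f(a - h, b - h)) / (4.0 * h**2)
    return fxx, fxy, fyy


def is_lower_on_circle(f, a, b, r=0.1, samples=360):
    """True if f is strictly below f(a, b) at every sampled point on the circle."""
    centre = f(a, b)
    return all(
        f(a + r * math.cos(2.0 * math.pi * k / samples),
          b + r * math.sin(2.0 * math.pi * k / samples)) < centre
        for k in range(samples)
    )


def f4(x, y):
    """Example 4 from the list above."""
    return math.cos(x**2 + y**2)


# All three second partials are (numerically) zero, so the quadratic test is silent...
print(hessian_fd(f4, 0.0, 0.0))
# ...yet the sampled values around (0, 0) all lie below f(0, 0) = 1.
print(is_lower_on_circle(f4, 0.0, 0.0))
```

The same `hessian_fd` helper can be pointed at the first three examples, where the estimated second partials are nonzero and the eigenvalue test decides the matter.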