Numerically computing $\min_{(x,y)\in\mathbb{R}^2}\left(\frac{x^2}{2}+\frac{3y^2}{2}+xy+y-x\right)$


I'm practicing for an exam and this question popped up on an old exam. The following are the subquestions:

a) Show that $s_0=(1,-1)^T$ is a descent direction in $x_0=(0,0)^T.$

b) Determine $x_1$ by exactly solving the line search problem with starting point $x_0$ and search direction $s_0$.

c) What happens if you solve the optimization problem with Newton's method for multidimensional optimization?

Here is what the examiner proposes:

a) We have that

$$\frac{d}{da}f(x_0+as_0) \rvert_{a=0} = \frac{d}{da}\left(\frac{a^2}{2}+\frac{3a^2}{2}-a^2-2a\right)\Bigg\rvert_{a=0}=-2<0.$$

The directional derivative is negative in that direction, thus it is a descent direction.
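This can be checked numerically: for a differentiable $f$, the directional derivative at $x_0$ along $s_0$ equals $\nabla f(x_0)^T s_0$. A minimal sketch in plain Python (the helper name `grad_f` is mine, computed by hand from $f$):

```python
def grad_f(x, y):
    # Gradient of f(x, y) = x^2/2 + 3y^2/2 + x*y + y - x
    return (x + y - 1, x + 3*y + 1)

gx, gy = grad_f(0.0, 0.0)       # gradient at x0 = (0, 0) is (-1, 1)
s = (1.0, -1.0)                  # search direction s0
dd = gx * s[0] + gy * s[1]       # directional derivative grad^T s
print(dd)                        # -2.0 < 0, so s0 is a descent direction
```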

Question 1: I understand that we have to show that the directional derivative is less than zero in order for it to be a descent direction, but why are they choosing the step size $a=0$? We haven't been given the optimal step size yet; according to the book I'm supposed to find it using the Jacobian and Hessian.

Why are we even interested in a function in terms of $a$?


b) $f(x_0+as_0)=a^2-2a,$ so the minimum is attained at $a=1,$ giving $x_1=x_0+as_0=(1,-1)^T.$
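The exact line search can be verified directly: $\varphi(a)=f(x_0+as_0)=a^2-2a$ has $\varphi'(a)=2a-2$, which vanishes at $a=1$. A quick sketch in plain Python (variable names are mine):

```python
def f(x, y):
    # The objective f(x, y) = x^2/2 + 3y^2/2 + x*y + y - x
    return x**2/2 + 3*y**2/2 + x*y + y - x

# phi(a) = f(x0 + a*s0) with x0 = (0, 0), s0 = (1, -1) reduces to a^2 - 2a,
# so phi'(a) = 2a - 2 = 0 gives the exact step a = 1.
a = 1.0
x1 = (0.0 + a * 1.0, 0.0 + a * (-1.0))
print(x1, f(*x1))   # (1.0, -1.0), with f(x1) = -1.0
```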

Question 2: I think the question I have here results from my lack of understanding of a). I don't understand when I can use the derivative like this to find the step size and when I should use the fact that

$$a_k=-\frac{\nabla f(x_k)^Ts_k}{s_k^THs_k},$$

where $H$ is the Hessian.


c) This is an optimization problem with a quadratic objective function. Newton's method solves such a problem exactly in one step.

Question 3: Why?


There is 1 answer below.

Best answer:
  1. Note that we are not plugging $\alpha=0$ into the function; we are plugging it into the derivative. So it is not a step size for the minimization, it is the way to evaluate the derivative at the point. A descent direction means that the directional derivative at $x_0$ is negative. After computing the derivative of $f(x_0+\alpha s_0)$ with respect to $\alpha$, we set $\alpha=0$ to evaluate it at the point $x_0+\alpha s_0=x_0$, which is the point we are interested in.
  2. It does not matter whether you use the formula for $\alpha$ or differentiate $f(x_0+\alpha s_0)$ and find the critical point; both give the same result for quadratic functions. Differentiating $a^2-2a$ feels easier here. Note that this works only for quadratic functions with a positive definite Hessian; otherwise what you get is not the minimum.
  3. Because the core of Newton's method is to build the quadratic approximation of the function from the Taylor expansion and minimize it instead of the function. Since the function is quadratic, it is its own quadratic approximation, so Newton's method gives you the exact minimum of the function itself in one step. Note again that the Hessian must be positive definite; otherwise the critical point you find is not the minimum.
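One Newton step from $x_0=(0,0)$ can be carried out by hand: solve $H\,d=-\nabla f(x_0)$ and set $x_1=x_0+d$. A sketch in plain Python using Cramer's rule for the $2\times 2$ system (variable names are mine):

```python
# One Newton step from x0 = (0, 0): solve H d = -grad f(x0), then x1 = x0 + d.
# H = [[1, 1], [1, 3]] is constant and positive definite, so this single step
# lands exactly on the minimizer of the quadratic objective.
gx, gy = -1.0, 1.0                     # grad f at (0, 0)
a11, a12, a21, a22 = 1.0, 1.0, 1.0, 3.0
det = a11 * a22 - a12 * a21            # det H = 2
dx = (-gx * a22 + gy * a12) / det      # Cramer's rule for H d = (-gx, -gy)
dy = (gx * a21 - gy * a11) / det
print((dx, dy))                        # (2.0, -1.0), the exact minimizer
```

Note that the exact minimizer $(2,-1)$ differs from the single line-search iterate $x_1=(1,-1)$ from b): one exact line search along a fixed direction does not, in general, reach the minimum, while one Newton step on a quadratic does.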