Given the minimization problem
$$\underset{x_1, x_2}{\text{minimize}} \quad f (x_1, x_2) := x_1^2 + y x_2^2 - x_1 x_2 - x_1 - x_2$$
My algorithm works fine for some values of $y$, but for others, for example $y = -1$, my gradient descent does not terminate regardless of the initial step size and starting point chosen.
I cannot figure out why this is happening.
I evaluated the Hessian matrix, and for $y = -1$ its determinant comes out negative, so the Hessian is indefinite. That means the function is neither convex nor concave. Can this be the reason it is not terminating?
I read about this on the internet and found that the gradient descent algorithm can be applied to functions in general, regardless of whether they are convex or concave. I also printed the value of the function at each iteration: the values are diverging rather than converging, and an overflow occurred after millions of iterations. As I said, this only happens when the Hessian is indefinite. Please help!
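For reference, here is a minimal sketch of the setup described above (fixed step size and a start at the origin are my assumptions, not taken from the question); it reproduces the divergence for $y = -1$ and convergence for, e.g., $y = 1$:

```python
# Gradient descent on f(x1, x2) = x1^2 + y*x2^2 - x1*x2 - x1 - x2
# with a fixed step size, starting from the origin.

def grad_descent(y, step=0.1, iters=200, x1=0.0, x2=0.0):
    f = lambda x1, x2: x1**2 + y * x2**2 - x1 * x2 - x1 - x2
    for _ in range(iters):
        g1 = 2 * x1 - x2 - 1      # df/dx1
        g2 = 2 * y * x2 - x1 - 1  # df/dx2
        x1, x2 = x1 - step * g1, x2 - step * g2
    return x1, x2, f(x1, x2)

# y = 1: converges to the minimizer (1, 1) with f = -1
print(grad_descent(1.0))
# y = -1: the function value decreases without bound
print(grad_descent(-1.0))
```

With $y = -1$ the iterates grow geometrically, which matches the overflow behaviour observed after enough iterations.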
$$\frac{\partial f }{\partial x_1} =2x_1 - x_2 -1 $$ $$\frac{\partial f }{\partial x_2} =-x_1 + 2yx_2 -1 $$
So stationary points (optima or saddle points) occur where both partial derivatives vanish:
$$2x_1 - x_2 =1 $$ $$-x_1 + 2yx_2 =1 $$
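Solving this $2\times 2$ linear system gives the unique stationary point $x_2 = \frac{3}{4y-1}$, $x_1 = \frac{2y+1}{4y-1}$ whenever $y \ne \frac{1}{4}$; for example, $y = 1$ gives $(1, 1)$. A quick numerical check (using NumPy, my choice of tool):

```python
import numpy as np

# Stationary point: solve the linear system
#    2*x1 -    x2 = 1
#   -x1   + 2y*x2 = 1
def stationary_point(y):
    A = np.array([[2.0, -1.0], [-1.0, 2.0 * y]])
    b = np.array([1.0, 1.0])
    return np.linalg.solve(A, b)  # singular exactly when y = 1/4

print(stationary_point(1.0))   # -> [1. 1.]
print(stationary_point(-1.0))  # -> [0.2 -0.6]
```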
Computing the Hessian (the matrix of second derivatives):
$$H= \begin{pmatrix}2 &-1\\-1 &2y\end{pmatrix},$$
we see that its determinant, $4y-1$, is zero when $y=\frac{1}{4}$ and positive for $y>\frac{1}{4}$.
Thus, for $y>\frac{1}{4}$ the Hessian is positive definite (its leading entry $2>0$ and its determinant is positive), the function is strictly convex, and a unique minimum exists. For $y<\frac{1}{4}$ the Hessian is indefinite, so the stationary point is a saddle point and the function has no minimum; it is unbounded below, which is why your gradient descent diverges. For $y=\frac{1}{4}$ the Hessian is singular and higher-order tests are required, so says https://mathworld.wolfram.com/SecondDerivativeTest.html.
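You can verify the definiteness in each regime by checking the eigenvalues of $H$ directly (a sketch using NumPy's `eigvalsh`, which returns eigenvalues in ascending order):

```python
import numpy as np

# Eigenvalues of the Hessian H = [[2, -1], [-1, 2y]]
def hessian_eigs(y):
    H = np.array([[2.0, -1.0], [-1.0, 2.0 * y]])
    return np.linalg.eigvalsh(H)  # ascending order

print(hessian_eigs(1.0))   # both positive: positive definite, minimum
print(hessian_eigs(-1.0))  # mixed signs: indefinite, saddle point
print(hessian_eigs(0.25))  # one eigenvalue is 0: test inconclusive
```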