See the book *Optimization: Algorithms and Applications* by Rajesh Kumar Arora, page 56.
... ...
The necessary conditions for $x^*$ to be a minimum are that $$\nabla f(x^*) = 0 \qquad (3.1)$$
and that $H$ is positive definite, i.e. $x^T H x > 0$ for all non-zero $x$. To ensure this, the eigenvalues of $H$ are to be positive. Consider a two-variable function $$f(x) = x_1^2 + x_2^2 - 2x_1 \qquad (3.2)$$
... ...
I have three questions:
- What is $Hx$?
- What is meant by the term positive definite?
- Why are (a) $\displaystyle \nabla f(x^*) = 0$ and (b) $\displaystyle x^T H x > 0$ necessary conditions for $x^*$ to be a minimum?
The matrix $H$ is referred to as the Hessian matrix, the matrix of second partial derivatives of $f$:
$$H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n} \\[2.2ex] \dfrac{\partial^2 f}{\partial x_2\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2\,\partial x_n} \\[2.2ex] \vdots & \vdots & \ddots & \vdots \\[2.2ex] \dfrac{\partial^2 f}{\partial x_n\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_n\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$
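For concreteness, take the book's example (3.2): the second partials are $\dfrac{\partial^2 f}{\partial x_1^2} = \dfrac{\partial^2 f}{\partial x_2^2} = 2$ and $\dfrac{\partial^2 f}{\partial x_1\,\partial x_2} = 0$, so
$$H = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$$
which is constant in $x$, because $f$ is quadratic.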
Positive definite means that for any non-zero vector $z$, the quantity $z^T H z$ is strictly positive. Informally, it is the matrix analogue of being a positive number: a $1 \times 1$ matrix $H = [h]$ is positive definite exactly when the scalar $h$ is positive, since then $z^T H z = h z^2 > 0$ for every $z \neq 0$.
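Here is a minimal NumPy sketch of both tests, using the Hessian of the book's example (3.2); the variable names and the random-sampling check are mine, not the book's:

```python
import numpy as np

# Hessian of f(x) = x1^2 + x2^2 - 2*x1 from (3.2) -- constant, since f is quadratic.
H = np.array([[2.0, 0.0],
              [0.0, 2.0]])

# Eigenvalue criterion: a symmetric matrix is positive definite
# iff all of its eigenvalues are strictly positive.
eigenvalues = np.linalg.eigvalsh(H)  # eigenvalue routine for symmetric matrices
print("eigenvalues:", eigenvalues)                          # [2. 2.]
print("positive definite:", bool(np.all(eigenvalues > 0)))  # True

# The definition checked directly: z^T H z > 0 for random non-zero z.
rng = np.random.default_rng(seed=0)
for _ in range(5):
    z = rng.standard_normal(2)
    print(float(z @ H @ z) > 0)  # True on every draw
```

The eigenvalue test is the one the book appeals to; for symmetric matrices it is equivalent to the $z^T H z > 0$ definition, and Hessians of twice continuously differentiable functions are always symmetric.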
Start with a function of one variable. For $x^*$ to be a local minimum, wiggling away from $x^*$ must make the function larger, i.e. for all sufficiently small $\epsilon \neq 0$ we must have $$f(x^*+\epsilon )\gt f(x^*)\quad\Rightarrow\quad f(x^*+\epsilon )- f(x^*)\gt 0$$ The Taylor series tells us that $$f(x^*+\epsilon)= f(x^*)+\frac{df}{dx}\bigg|_{x^*}\epsilon+\frac 12\frac{d^2f}{dx^2}\bigg|_{x^*}\epsilon^2+O(\epsilon^3)$$ Substituting this expansion into the definition of a minimum, the $f(x^*)$ terms cancel and we are left with $$\frac{df}{dx}\bigg|_{x^*}\epsilon+\frac 12\frac{d^2f}{dx^2}\bigg|_{x^*}\epsilon^2+O(\epsilon^3)\gt 0$$ As $\epsilon\to 0$ the linear term dominates, and $\epsilon$ can take either sign, so the inequality can only hold for all small $\epsilon$ if $\frac{df}{dx}\big|_{x^*}=0$; what remains is $\frac 12\frac{d^2f}{dx^2}\big|_{x^*}\epsilon^2\gt 0$, i.e. $\frac{d^2f}{dx^2}\big|_{x^*}\gt 0$.
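As a quick sanity check, take the $x_2 = 0$ slice of the book's example (3.2), $g(x_1) = x_1^2 - 2x_1$:
$$g'(x_1) = 2x_1 - 2 = 0 \;\Rightarrow\; x_1^* = 1, \qquad g''(1) = 2 \gt 0,$$
and indeed $g(1+\epsilon) - g(1) = \epsilon^2 \gt 0$ for every $\epsilon \neq 0$; there is no $O(\epsilon^3)$ remainder here because $g$ is quadratic.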
Now in $n$ dimensions, which is your case ($\epsilon$ becomes a vector), the same expansion gives $$\nabla f(x^*)^T\epsilon+ \frac 12\,\epsilon^T \nabla^2 f(x^*)\,\epsilon\gt 0$$ Using the first-order condition $\nabla f(x^*)=0$, this reduces to $$\epsilon^T \nabla^2 f(x^*)\,\epsilon\gt 0$$ Demanding $\epsilon^T A \epsilon\gt 0$ for every non-zero $\epsilon$ is exactly the definition of $A$ being positive definite, so the second-order condition for a local minimum becomes $$\nabla^2 f(x^*)\succ 0,$$ i.e. the Hessian $H = \nabla^2 f(x^*)$ is positive definite. (Strictly speaking, only positive *semi*-definiteness, $\epsilon^T \nabla^2 f(x^*)\,\epsilon\ge 0$, is necessary; $\nabla f(x^*)=0$ together with positive definiteness is the standard second-order *sufficient* condition.)
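Putting the two conditions together for the book's example (3.2):
$$\nabla f(x) = \begin{bmatrix} 2x_1 - 2 \\ 2x_2 \end{bmatrix} = 0 \;\Rightarrow\; x^* = \begin{bmatrix} 1 \\ 0 \end{bmatrix},$$
and the Hessian computed above is positive definite (eigenvalues $2, 2 \gt 0$), so $x^*$ satisfies both conditions. Completing the square confirms it: $f(x) = (x_1 - 1)^2 + x_2^2 - 1 \ge -1 = f(x^*)$, so $x^* = (1,0)^T$ is in fact the global minimum.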