The quadratic case in nonlinear programming


I'm reading about nonlinear programming and stumbled on the following statement, which made me wonder a bit:

Consider the function

$$f(\textbf{x}) = \frac{1}{2}\textbf{x}^T\textbf{Q}\textbf{x}-\textbf{x}^T\textbf{b},$$

where $\textbf{Q}$ is a positive definite symmetric $n\times n$ matrix. Since $\textbf{Q}$ is positive definite, all of its eigenvalues are positive. We assume that these eigenvalues are ordered: $0<a=\lambda_1\leq \lambda_2 \leq \cdots \leq \lambda_n = A.$ With $\textbf{Q}$ positive definite, it follows (from proposition 5) that $f$ is strictly convex.

Proposition 5

Let $f\in C^2$. Then $f$ is convex over a convex set $\Omega$ containing an interior point if and only if the Hessian matrix $\textbf{F}$ of $f$ is positive semidefinite throughout $\Omega$.

My question is: Does the statement above then imply that $\textbf{Q}$ is the Hessian of $f$, and not just some arbitrary positive definite $n\times n$ matrix? This was not explicitly stated in the text of my book...

The book is: Linear and Nonlinear Programming, 3rd edition, Luenberger, page 235.

Thank you for any help! =)



Accepted answer

As a complement to Henning's answer, here is the direct computation showing that $\mathbf{Q}$ is by construction the Hessian of $f(\mathbf{x})$.

For convenience I will work in index notation and employ the Einstein summation convention (i.e. indices appearing twice are summed over). Since the $ij$th element of the Hessian of $f(\mathbf{x})$ is by definition $\partial_i \partial_j f$, we have
\begin{align}
\partial_i \partial_j f &= \partial_i \partial_j\left( \tfrac{1}{2}x_k Q_{kl} x_l - x_k b_k\right)\\
&= \partial_i\left(\tfrac{1}{2} \delta_{jk} Q_{kl} x_l + \tfrac{1}{2} x_k Q_{kl} \delta_{jl} - \delta_{jk}b_k\right)\\
&= \tfrac{1}{2}Q_{kl}\left(\delta_{jk}\delta_{il}+\delta_{ik}\delta_{jl}\right) - 0\\
&= \tfrac{1}{2}\left(Q_{ji}+Q_{ij}\right)=Q_{ij},
\end{align}

where we have made use of $\partial_i x_j = \delta_{ij}$ to eliminate the implied summations, as well as the symmetry $Q_{ij}=Q_{ji}$ (and the fact that $\mathbf{b}$ is constant, so its term differentiates to zero).
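As a quick sanity check of the computation above, here is a small Python sketch (my own illustration, not from the book; $Q$, $\mathbf b$, and the helper names are made up) that approximates the Hessian of $f$ by central differences and compares it to $Q$:

```python
import numpy as np

# Sanity check (my own illustration): for a randomly generated symmetric
# positive definite Q, the central-difference Hessian of
# f(x) = 1/2 x^T Q x - x^T b matches Q entrywise.
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)        # symmetric positive definite by construction
b = rng.standard_normal(n)

def f(x):
    return 0.5 * x @ Q @ x - x @ b

def numerical_hessian(func, x, eps=1e-4):
    """Central-difference approximation of the Hessian of func at x."""
    n = len(x)
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            ei, ej = eps * I[i], eps * I[j]
            H[i, j] = (func(x + ei + ej) - func(x + ei - ej)
                       - func(x - ei + ej) + func(x - ei - ej)) / (4 * eps**2)
    return H

x0 = rng.standard_normal(n)
H = numerical_hessian(f, x0)
print(np.allclose(H, Q, atol=1e-4))   # True: f is exactly quadratic
```

Since $f$ is quadratic, the finite-difference formula is exact up to floating-point rounding, so the match holds at any base point $x_0$.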

Henning's answer

The Hessian matrix $H$ is the symmetric matrix that works as the coefficient of the quadratic term in the Taylor expansion of $f$:

$$f(\mathbf{x}+\mathbf{h}) = f(\mathbf{x}) + J(\mathbf{x})\mathbf{h} +\frac{1}{2} \mathbf{h}^T H(\mathbf{x}) \mathbf{h} + o(|\mathbf h|^2)$$

(where the Jacobian $J(\mathbf x)$ is just a row matrix when $f(\mathbf x)$ is a scalar).

If you expand your $f$ in the argument $\mathbf x+\mathbf h$ you get

$$\begin{align} f(\mathbf x+\mathbf h) &= \tfrac12 (\mathbf x+\mathbf h)^T Q (\mathbf x+\mathbf h) - (\mathbf x+\mathbf h)^T\mathbf b \\ &= \tfrac12 \mathbf x^T Q \mathbf x + \tfrac12 \mathbf h^T Q \mathbf x + \tfrac12 \mathbf x^T Q \mathbf h + \tfrac12 \mathbf h^T Q \mathbf h - \mathbf x^T \mathbf b - \mathbf h^T \mathbf b \\&= \left(\tfrac12 \mathbf x^T Q \mathbf x - \mathbf x^T \mathbf b\right) + \left(\mathbf x^T Q - \mathbf b^T\right)\mathbf h + \tfrac12 \mathbf h^T Q \mathbf h \end{align} $$

where $\tfrac12 \mathbf h^T Q \mathbf x = \tfrac12 \mathbf x^T Q \mathbf h$ by the symmetry of $Q$, and there is no remainder term at all,

so $Q$ is indeed the Hessian of your $f$ everywhere -- not because of Proposition 5, but because of how $f$ is defined in the first place.
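This expansion can also be checked numerically. Here is a short Python sketch (my own notation, with a randomly generated $Q$, $\mathbf b$, $\mathbf x$, and $\mathbf h$, none of which come from the book) confirming that the identity holds exactly, with no remainder:

```python
import numpy as np

# Sketch (my notation): for the quadratic f(x) = 1/2 x^T Q x - x^T b,
# the expansion
#   f(x + h) = f(x) + (x^T Q - b^T) h + 1/2 h^T Q h
# holds exactly, so the coefficient of the quadratic term, i.e. the
# Hessian, is Q everywhere.
rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)        # symmetric positive definite
b = rng.standard_normal(n)

def f(x):
    return 0.5 * x @ Q @ x - x @ b

x = rng.standard_normal(n)
h = rng.standard_normal(n)

lhs = f(x + h)
rhs = f(x) + (x @ Q - b) @ h + 0.5 * h @ Q @ h
print(np.isclose(lhs, rhs))   # True up to floating-point rounding
```

Because $f$ has no terms beyond second order, the agreement is exact (up to rounding) for any $\mathbf h$, not just small ones.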