Find the geometry of the curves of the contour lines of $f(x) = \frac{1}{2}x^tAx + b^tx + c$

917 Views Asked by At

Find the geometry of the curves of the contour lines of a quadratic function $$f(x) = \frac{1}{2}x^tAx + b^tx + c$$ where $A \in \mathbb{R}^{2 \times 2}$, $b\in \mathbb{R}^2$ and $c\in \mathbb{R}$ in the following cases:

  • $A>0$

  • $A\ge 0$ and there exists $x$ such that $Ax+b = 0$

  • $A\ge 0$ and there is no $x$ such that $Ax + b = 0$

  • $A$ is undefined and non-singular.

I suppose $^t$ is the transpose. What is $A>0$?

I'm trying to develop a technique to see this. If we write $A = \begin{bmatrix} a & b \\ b & c\\ \end{bmatrix}$ then the function becomes

$$f((x_1,x_2)) = \frac{1}{2}\begin{bmatrix} x_1 & x_2 \end{bmatrix}\begin{bmatrix} a & b \\ b & c\\ \end{bmatrix}\begin{bmatrix} x_1 \\ x_2\\ \end{bmatrix} + \begin{bmatrix} b_1 & b_2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2\\ \end{bmatrix} + d = \\ \frac{1}{2}(ax_1^2 + 2bx_1x_2 + cx_2^2) + b_1x_1 + b_2x_2 + d $$

I don't know if I can simplify $(ax_1^2 + 2bx_1x_2 + cx_2^2)$. I think not. Maybe it's a shape of its own and I should recognize. I thin I can see it as approximate $x^2 + y^2$ everywhere since they grow much faster than $xy$. So the contour here would be circles? Is there a more accurate way of drawing the contour? I cannot suppose they're circles, I need to find what they truly are.

Anyways, $A>0$ implies that $\frac{1}{2}(ax_1^2 + 2bx_1x_2 + cx_2^2)>0$ right?

And $A\ge 0 \implies \frac{1}{2}(ax_1^2 + 2bx_1x_2 + cx_2^2)\ge 0$, and the condition there exists $x$ such that $Ax+b=0$ means there exists $x$ such that $\begin{bmatrix} ax_1 + bx_2 + b_1 \\ ax_2 + bx_1 + b_2\\ \end{bmatrix} = \begin{bmatrix} 0 \\ 0\\ \end{bmatrix}\implies ax_1 + bx_2 + b_1 = 0, ax_2 + bx_1 + b_2 = 0$

What this should tell me?

$A$ being undefined and non singular means an arbitrary nonsingular matrix, I suppose. So the invertibility or the determinant of $A$ plays a role here.

1

There are 1 best solutions below

9
On BEST ANSWER

In the optimization community, $A > 0$ (where $A$ is a matrix), or more commonly written as $A \succ 0$, usually means $A$ is positive definite. Also, $A \succeq 0 $ means positive semidefinite.

Notice that $\nabla f(x) = Ax+b$ and $\nabla^2 f(x) = A$.

If $A \succ 0$, we have a strongly convex quadratic function. The curves should be ellipses. They can be very elongated if the condition number is large.

If $A \succeq 0$ but $A \not \succ 0$, then at least one of the eigenvalues is $0$.

  • If there is an $x$ such that $\nabla f(x)=0$, there are multiple solutions to the system $$\nabla f(x)=Ax+b=0$$ It is like a parabolic sheet with multiple minimum points. It is also possible for the function to be constant everywhere.

  • Suppose there is no $x$ such that $\nabla f(x)=0$, for example when $A=0, b \ne 0$, there is no minimum point, the problem is unbounded.

The most ambiguous part would be the word "undefined". I would guess it to be indefinite. If we have a positive eigenvalues and a negative eigenvalue, we have a saddle point.

Edit:

Usually, positive definite and positive semidefinite matrices refer to the case where the matrices are symmetric and we know that there exist orthogonal matrices $U$ such that $A=UDU^T$ where $D=\mbox{diag}(\lambda_1, \lambda_2)$ is a diagonal matrix which consists of the eigenvalues.

$$f(x) = x^TAx+b^Tx+c=x^TUDU^Tx+b^TUU^Tx + c$$

Let $y= U^Tx$ and $p = U^Tb$. We might want to study

\begin{align}g(y)&=y^TDy+p^Ty+c \\ &= \sum_{i=1}^2 (\lambda_i y_i^2 + p_i y_i)+c\end{align}

If $A \succ 0 $, then all the eigenvalues are positive.

\begin{align}g(y) &= \sum_{i=1}^2 \lambda_i \left( y_i^2 + \frac{p_i}{\lambda_i} y_i\right)+c \\ &= \sum_{i=1}^2 \left[ \lambda_i\left( y_i + \frac{p_i}{2\lambda_i} \right)^2 - \lambda_i\left( \frac{p_i^2}{4\lambda_i^2}\right)^2\right]+c\end{align}

Let $z_i = y_i + \frac{p_i}{2\lambda_i}$ and $q = c - \sum_{i=1}^2\lambda_i\left( \frac{p_i^2}{4\lambda_i^2}\right)^2$

then we have $$h(z) = \sum_{i=1}^2 \lambda_i z_i^2 + q$$

It should be easy to recognize the contour of $h$ should be consists of ellipses. From variable $y$ to $z$ what happens was a translation. From variable $x$ to $y$, the transformation is an orthogonal matrix multiplication, it is just a rotation / reflection. Hence $f$ contours consists of ellipses.

enter image description here

Similar analysis can be repeated as long as none of the $\lambda_i$ are zero.

In the indefinite and non-singular case, one of the eigenvalue is positive and one of the eigenvalue is positive. By the same argument, we notice that the contour lines consists of hyperbola.

enter image description here

If $A=0$ and $b=0$ then $f$ is a constant.

If $A=0$ and $b \ne 0$, then $f$ is linear and the contour consists of parallel straight lines.

If we consider the case where $A$ is non-zero, positive semidefinite but not positive definite, then one of the eigenvalue, $\lambda_1$ is positive and the other is $\lambda_2=0$.

Then $$g(y) = \lambda_1y_1^2+p_1y_1+p_2y_2+c$$

Also, $Ax+b=0$ become $UDU^Tx + b=0$, which is equivalent to $DU^Tx +U^Tb=0$ which is $\lambda_1 y_1 + p_1 =0$ and $0=p_2$. Hence whether the system $Ax+b=0$ is consistent and only if $p_2=0$.

If $Ax+b=0$ is consistent, that is if $p_2=0$, then our function $g$ is $$g(y) = \lambda_1y_1^2+p_1y_1+c$$

It is independent of $y_2$ and notice that it has multiple global minimum of the form of $(y_1^*, y_2)$. The contour lines are straight lines parallel to the $y_2$-axis.

If $Ax+b=0$ is not consistent, that is if $p_2 \ne 0$, then our function is $$g(y) = \lambda_1y_1^2+p_1y_1+p_2y_2+c$$

and since $p_2 \ne 0$, the contour lines consists of parabolas.

enter image description here

Picture source can be found here