How to compute the hessian of the log barrier function

2.3k Views Asked by At

According to note, the log barrier function is given by (page 10):

$f(x) = - \sum\limits_{i = 1}^m \log(b_i - a_i^Tx)$ where $b_i$ is a scalar, $a_i, x$ are $n$ dimensional vectors

I have successfully computed the gradient to be

$\nabla f(x) = A^Td$, where $d_i = \dfrac{1}{b_i-a_i^Tx}$ (pg 11)

I am trying to compute the Hessian but I cannot figure out why the final solution is $A^T diag(d)^2 A$, where did the diag came from?

Recall the hessian is given by $\nabla^2 f(x) = [\dfrac{\partial^2 f(x)}{\partial x_i \partial x_j}]$

I am certain that the coefficients are not zero for off-diagonal entries...since I am literally taking the derivative of the inner affine function $b_i-a_i^Tx$

Can someone explain why the final solution has a diagonal form?

1

There are 1 best solutions below

0
On BEST ANSWER

There are a couple of identities involving the Hadamard ($\circ$) product that you'll need in order to see how the Diag() operation arises.

Let
$\,\,\,\,x,y$ = arbitrary vectors
$\,\,\,\,1$ = vector of all ones
$\,\,\,\,{\rm Diag}(x)$ = function which returns a diagonal matrix whose diagonal equals the vector $x$

Then $$\eqalign{ {\rm Diag}(x)\,y &= x\circ y \cr &= y\circ x \cr }$$ The last relation is due to the fact that the Hadamard product is commutative.
Also, since we'll be talking about differentials, let's denote your vector $d$, by the vector $w$ instead.

Since you already know that the gradient of $f$ is $$ g = A^Tw $$ we'll start from there.

For convenience, define the variables $$\eqalign{ z &= Ax \cr y &= b - z \cr w &= 1/y \cr W &= {\rm Diag}(w) \cr }$$ The expression for $w$ uses Hadamard division, so we could express the relationship between $y$ and $w$ as $$\eqalign{ 1 &= w\circ y \cr 0 &= d(w\circ y) = w\circ dy + y\circ dw \cr dw &= -w\circ w\circ dy \cr }$$ Now expand the differential of the gradient, substituting variables step-by-step $$\eqalign{ dg &= A^T dw \cr &= A^T (-w\circ w\circ dy) \cr &= A^T (w\circ w\circ dz) \cr &= A^T (W W dz) \cr &= A^T (W W A\,dx) \cr &= A^T W^2 A\,dx \cr }$$ And so we arrive at an expression for the Hessian as $$ A^T W^2 A $$