Gradient & Hessian for log sigmoid function


I have a log sigmoid loss function, $$ l(\textbf w) = -\frac{1}{n}\sum_{i=1}^{n}\log\bigl(\sigma(y_i\textbf w^T\textbf x_i)\bigr), $$ where $y_i \in \{-1, 1\}$ is the class label, $\textbf w$ is the parameter vector, and $\textbf x_i$ is the $i$-th row of the $n \times d$ training data matrix $X$. How do I derive its gradient and Hessian? I know that the Hessian can be expressed in the form $X^TDX$, where $D$ is a diagonal matrix with positive diagonal entries, which proves positive semidefiniteness (and positive definiteness when $X$ has full column rank), but I have no idea how to arrive at these conclusions.


There is 1 answer below.


Apply the chain rule and recall that $w^Tx$ is linear in $w$, so what does that tell you about the gradient?
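To expand the hint, here is a sketch of the standard derivation (not from the original answer), using the identities $\sigma'(t)=\sigma(t)(1-\sigma(t))$ and $1-\sigma(t)=\sigma(-t)$, and writing $z_i = y_i\textbf w^T\textbf x_i$:

$$
\nabla l(\textbf w) = -\frac{1}{n}\sum_{i=1}^{n}\bigl(1-\sigma(z_i)\bigr)\,y_i\textbf x_i
= -\frac{1}{n}\sum_{i=1}^{n}\sigma(-z_i)\,y_i\textbf x_i,
$$

$$
\nabla^2 l(\textbf w) = \frac{1}{n}\sum_{i=1}^{n}\sigma(z_i)\bigl(1-\sigma(z_i)\bigr)\,y_i^2\,\textbf x_i\textbf x_i^T
= \frac{1}{n}X^TDX,
$$

since $y_i^2 = 1$, with $D_{ii} = \sigma(z_i)\bigl(1-\sigma(z_i)\bigr) > 0$. Because $D$ has strictly positive diagonal entries, $\textbf v^TX^TDX\textbf v = (X\textbf v)^TD(X\textbf v) \ge 0$ for all $\textbf v$, so the Hessian is positive semidefinite, and positive definite whenever $X$ has full column rank.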
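As a sanity check (my own sketch, not part of the original post), the closed forms $\nabla l(\textbf w) = -\frac{1}{n}X^T(y \odot \sigma(-z))$ and $\nabla^2 l(\textbf w) = \frac{1}{n}X^TDX$ can be verified numerically with NumPy; all variable names below are illustrative:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def loss(w, X, y):
    # l(w) = -(1/n) * sum_i log(sigma(y_i * w^T x_i))
    return -np.mean(np.log(sigmoid(y * (X @ w))))

def grad(w, X, y):
    # grad l(w) = -(1/n) * X^T (y * sigma(-z)),  z_i = y_i * w^T x_i
    n = X.shape[0]
    z = y * (X @ w)
    return -(X.T @ (y * sigmoid(-z))) / n

def hess(w, X, y):
    # Hess l(w) = (1/n) * X^T D X with D_ii = sigma(z_i) * (1 - sigma(z_i)) > 0
    n = X.shape[0]
    z = y * (X @ w)
    d = sigmoid(z) * (1 - sigmoid(z))   # diagonal of D, strictly positive
    return (X.T * d) @ X / n            # row-wise scaling implements X^T D X

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
y = rng.choice([-1.0, 1.0], size=n)
w = rng.normal(size=d)

# Central-difference check of the analytic gradient
eps = 1e-6
g_num = np.array([(loss(w + eps * e, X, y) - loss(w - eps * e, X, y)) / (2 * eps)
                  for e in np.eye(d)])
print(np.allclose(grad(w, X, y), g_num, atol=1e-6))

# Hessian eigenvalues: nonnegative in general, positive here since the
# random X has full column rank almost surely
print(np.all(np.linalg.eigvalsh(hess(w, X, y)) > 0))
```

Both checks print `True`: the finite-difference gradient matches the closed form, and all Hessian eigenvalues are positive, consistent with the $X^TDX$ argument.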