How to compute the gradient of the logistic regression cost function using the directional derivative?


Let's assume we have a dataset $\Omega = \{ (x^k, y^k) \in \mathbb{R}^p \times \{0,1\} \mid 1 \leq k \leq n \}$. I would like to compute the gradient of the logistic regression cost function using the notion of directional derivative.

Let $E(\omega) = -\sum_{k=1}^n \left[ y^k \log(\sigma(x^k \omega)) + (1-y^k)\log(1 - \sigma(x^k \omega)) \right]$. Given a vector $d$, the directional derivative of $E$ at $\omega$ in the direction $d$ is defined as $$ D(E,d) = \lim_{t \to 0} \frac{E(\omega + td) - E(\omega)}{t} = \lim_{t \to 0} \Delta_t(E,d). $$
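To make the definition concrete, here is a minimal numerical sketch (all names illustrative) checking that $\Delta_t(f,d) \to \langle \nabla f, d \rangle$ as $t \to 0$ on a simple test function $f(\omega) = \|\omega\|^2$, whose gradient $2\omega$ is known:

```python
import numpy as np

def delta_t(f, w, d, t):
    """Finite-difference quotient (f(w + t*d) - f(w)) / t."""
    return (f(w + t * d) - f(w)) / t

# Test function with a known gradient: f(w) = ||w||^2, so grad f(w) = 2w.
f = lambda w: np.dot(w, w)
w = np.array([1.0, -2.0, 0.5])
d = np.array([0.3, 0.1, -0.7])

exact = np.dot(2 * w, d)  # <grad f(w), d>
for t in [1e-1, 1e-3, 1e-6]:
    print(t, delta_t(f, w, d, t), exact)
```

As $t$ shrinks, the quotient converges to the inner product of the gradient with $d$.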

Let $t > 0$. We have $$ \log(\sigma(x^k \omega)) = - \log(1+ e^{-x^k \omega}), $$ $$ \log(1 - \sigma(x^k \omega)) = - x^k \omega - \log(1+ e^{-x^k \omega}). $$ Hence, $$ E(\omega) = \sum_{k=1}^n \left[ (1-y^k)x^k\omega + \log(1+ e^{-x^k \omega}) \right]. $$ Now, when I try to calculate the directional derivative, I get $$ \Delta_t(E, d) = \frac{1}{t}\sum_{k=1}^n \left[ (1-y^k)t x^k d + \log\frac{1+e^{-x^k(\omega + td)}}{1+e^{-x^k\omega}} \right], $$ and I do not see how to pass to the limit and obtain the expected form $$ D(E,d) = \langle \nabla E(\omega) \mid d \rangle. $$ I think I made a mistake somewhere, so could you help me :) ? It would be great! Many thanks.
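For what it's worth, I can sanity-check the limit numerically against the standard logistic-regression gradient $\nabla E(\omega) = \sum_{k=1}^n (\sigma(x^k \omega) - y^k)\,(x^k)^T$ (treating each $x^k$ as a row vector). A minimal sketch on synthetic data, with all names (`X`, `y`, `w`, `d`) illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))        # rows are the x^k
y = rng.integers(0, 2, size=n)     # labels y^k in {0, 1}
w = rng.normal(size=p)
d = rng.normal(size=p)             # direction vector

sigma = lambda z: 1.0 / (1.0 + np.exp(-z))

def E(w):
    """Negative log-likelihood of logistic regression."""
    s = sigma(X @ w)
    return -np.sum(y * np.log(s) + (1 - y) * np.log(1 - s))

# Candidate gradient: sum_k (sigma(x^k w) - y^k) x^k
grad = X.T @ (sigma(X @ w) - y)

t = 1e-6
delta = (E(w + t * d) - E(w)) / t  # finite-difference quotient Delta_t(E, d)
print(delta, grad @ d)             # the two values should nearly coincide
```

The finite-difference quotient and $\langle \nabla E(\omega) \mid d \rangle$ agree closely for small $t$, which is consistent with that formula for the gradient.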