Differentiating a simplified version of logistic loss


I am taking a machine learning course by Andrew Ng on Coursera, and I'm currently learning about Logistic regression. The cost function is

$\operatorname{cost}\left(h_{\theta}(x), y\right)=-y \log \left(h_{\theta}(x)\right)-(1-y) \log \left(1-h_{\theta}(x)\right)$, where $h_{\theta}(x)$ is a hypothesis function.

So, $ h_{\theta}(x)=g\left(\theta^{T} x\right)$, where $g(z)=\frac{1}{1+e^{-z}}$
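These definitions are easy to check numerically. Below is a minimal Python sketch (not part of the original question) of the sigmoid $g$ and the hypothesis $h_\theta(x)$; the function names `g` and `h` are just illustrative:

```python
import math

def g(z):
    # Sigmoid: g(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    # Hypothesis: h_theta(x) = g(theta^T x) for vectors theta and x
    return g(sum(t * xi for t, xi in zip(theta, x)))

print(g(0.0))                       # 0.5
print(h([1.0, 2.0], [3.0, -1.0]))   # g(1.0) ≈ 0.7311
```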

I am trying to work through a very simplified version in order to get a better understanding of its derivative. I am computing

$\tfrac{\mathrm{d}}{\mathrm{d}\theta}\left[{y\log\left({h_{\theta}(x)}\right)+\left(1-y\right)\log\left(1-h_{\theta}(x)\right)}\right]$

$ = \tfrac{\mathrm{d}}{\mathrm{d}\theta}\left[{y\ln\left(\dfrac{1}{1+\mathrm{e}^{-\theta x}}\right)+\left(1-y\right)\ln\left(1-\dfrac{1}{1+\mathrm{e}^{-\theta x}}\right)}\right]$.

Till now, I have arrived at

$x\left(\dfrac{ye^{-\theta x}-1+y}{1+e^{-\theta x}}\right)$, which does not seem to be the right answer. The course states that the partial derivative of the cost function w.r.t. $\theta_j$ is $\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}$.

Where am I going wrong?


Accepted answer:

Just group together the terms with $y$. You'll get: $$x\left(\frac{ye^{-\theta x}-1+y}{1+e^{-\theta x}}\right)=x\left(y-\frac1{1+e^{-\theta x}}\right)=x\left(y-h_\theta(x)\right).$$ The minus sign in the course's formula appears because the cost negates the expression you differentiated, so differentiating the cost itself gives $\left(h_\theta(x)-y\right)x$. You also assumed $m=1$ (a single training example) and a scalar $\theta$, which is why you have an ordinary derivative $\frac{d}{d\theta}$. With $n$ features, use $$\theta^Tx=\sum_{j=1}^n \theta_j x_j$$ and take partial derivatives $\frac{\partial}{\partial\theta_j}$.
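As a sanity check of this result (an addition, not part of the original answer), the analytic derivative $(h_\theta(x)-y)\,x$ of the cost can be compared against a central finite difference, here for the scalar case $m=1$ with arbitrary example values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(theta, x, y):
    # cost(h, y) = -y log h - (1 - y) log(1 - h), scalar theta and x (m = 1)
    h = sigmoid(theta * x)
    return -y * math.log(h) - (1 - y) * math.log(1 - h)

def analytic_grad(theta, x, y):
    # Derivative derived above: d/dtheta cost = (h_theta(x) - y) * x
    return (sigmoid(theta * x) - y) * x

theta, x, y = 0.7, 1.5, 1.0
eps = 1e-6
numeric = (cost(theta + eps, x, y) - cost(theta - eps, x, y)) / (2 * eps)
print(abs(numeric - analytic_grad(theta, x, y)))  # tiny: the two derivatives agree
```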