I'm currently learning logistic regression and I'm really confused about the logistic loss. I know the sigmoid (logistic) function is $\displaystyle g(z)=\frac{1}{1+e^{-z}}$, where $z = W^TX_i$.
But I've read that this is a non-convex function, so we usually take the logarithm of the logistic loss to make it convex for optimization techniques like gradient descent:
$$J(\theta) =-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]$$ $$h_\theta(x^{(i)}) = \frac{1}{1+e^{-z}}, \qquad z = \theta^Tx^{(i)}$$
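For concreteness, the cost above can be sketched in a few lines of NumPy (a minimal sketch; the function and variable names here are my own, not from any particular library):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    """Cross-entropy cost J(theta) for labels y in {0, 1}."""
    m = len(y)
    h = sigmoid(X @ theta)  # h_theta(x^(i)) for every row of X at once
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
```

With $\theta = 0$ every prediction is $0.5$, so the cost is $\log 2 \approx 0.693$ regardless of the labels, which is a handy sanity check.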
How can we know whether a function/loss is convex or non-convex? Why do we take the log for the logistic loss, and how does that make the loss function convex?
$g$ is not convex: consider the points $0$ and $N$, where $N$ is an integer $>1$, and write $1=\frac 1 N \cdot N+\left(1-\frac 1 N\right)\cdot 0$. If $g$ were convex, we would have $g(1) \leq \frac 1 N g(N)+\left(1-\frac 1 N\right)g(0)$. Since $g(N) \leq 1$ and $g(0)=\frac 1 2$, letting $N \to \infty$ gives $\frac{e}{1+e}=g(1) \leq \frac 1 2$, i.e. $e \leq 1$, a contradiction.
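You can also see the failed inequality numerically with a finite $N$ in place of the limit (a quick sketch):

```python
import math

def g(z):
    # logistic function g(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + math.exp(-z))

N = 100
lam = 1.0 / N                       # so that 1 = lam*N + (1 - lam)*0
chord = lam * g(N) + (1 - lam) * g(0)

# Convexity would require g(1) <= chord, but
# g(1) ≈ 0.731 while chord ≈ 0.505, so the inequality fails.
print(g(1) <= chord)  # False
```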
$\log g$ is also not convex; it is concave, so $-\log g$ is convex. To see this, write $-\log g(x) = -\log \frac {e^{x}} {1+e^{x}}=-x +\log (1+e^{x})$. Its second derivative is $\frac{e^x}{(1+e^x)^2} = g(x)\,(1-g(x))$, which is strictly positive, so the function is convex.
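As a sanity check on that second derivative, one can compare the closed form $g(x)(1-g(x))$ against a central finite-difference approximation (a small sketch, not production code; `second_derivative` is a helper I made up for this check):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neg_log_g(x):
    # -log g(x) = -x + log(1 + e^x); log1p is more accurate for small e^x
    return -x + math.log1p(math.exp(x))

def second_derivative(f, x, h=1e-4):
    # central finite-difference approximation of f''(x)
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

for x in [-3.0, 0.0, 4.0]:
    approx = second_derivative(neg_log_g, x)
    exact = sigmoid(x) * (1.0 - sigmoid(x))  # always in (0, 1/4]
    print(x, approx, exact)
```

The approximation agrees with $g(x)(1-g(x))$ and stays positive everywhere, consistent with convexity.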