I wonder whether the loss function of logistic regression can be strongly convex when the explanatory variables are linearly independent. From a theoretical point of view, suppose I have a sample of $n$ observations on $p$ variables. I will just summarize the key results here. The loss function for the case $Y \in \{-1,1\}$ (which is the negative log-likelihood) is
$\displaystyle LL(\theta) = \sum_{i=1}^n \log \left( 1 + \exp(-Y_i \theta^T X_i ) \right)$
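As a quick numerical sketch of this loss (the variable names `X`, `y`, `theta` and the random data are illustrative, not from the question), one can evaluate it stably with `np.logaddexp`:

```python
import numpy as np

def logistic_loss(theta, X, y):
    """Negative log-likelihood for labels y in {-1, +1}."""
    # Each term is log(1 + exp(-y_i * theta^T x_i)); logaddexp(0, t) = log(1 + e^t)
    margins = y * (X @ theta)
    return np.sum(np.logaddexp(0.0, -margins))

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))    # n = 100 observations, p = 3 variables
y = np.sign(rng.standard_normal(100))
theta = np.zeros(3)
print(logistic_loss(theta, X, y))    # at theta = 0 every term equals log 2
```

At $\theta = 0$ every margin is zero, so the loss is exactly $n \log 2$.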
The Hessian is
$\displaystyle \nabla^2 LL(\theta) = X^T A X$, where $A$ is a diagonal matrix with $0 < A_{ii} \leq 1/4$.
We know that the Hessian is a Gram matrix, which is by definition positive semi-definite. Under the assumption that $X$ has full column rank, the Gram matrix is positive definite (all eigenvalues $> 0$).
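A small numerical check of this (with illustrative names and random data, assuming the usual sigmoid parameterization so that $A_{ii} = s_i(1 - s_i) \in (0, 1/4]$):

```python
import numpy as np

def logistic_hessian(theta, X):
    """Hessian X^T A X with A_ii = s_i (1 - s_i), s_i = sigmoid(theta^T x_i)."""
    s = 1.0 / (1.0 + np.exp(-(X @ theta)))   # sigmoid of each linear predictor
    A = s * (1.0 - s)                        # diagonal entries of A, in (0, 1/4]
    return (X * A[:, None]).T @ X            # X^T diag(A) X without forming diag(A)

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 4))             # full column rank almost surely
H = logistic_hessian(np.ones(4), X)
print(np.linalg.eigvalsh(H).min() > 0)       # positive definite when rank(X) = p
```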
However, with these assumptions, can we go further and conclude that the loss function is strongly convex when the matrix of explanatory variables has full rank? If not, what are we missing here?
To prove that your function is strongly convex with parameter $\sigma$, we must show that $X^TAX - \sigma I$ is positive semidefinite, which is equivalent to saying $$w^TX^TAXw \geq \sigma w^Tw$$ for any $w$. We can restrict to $||w||=1$, and since $A$ is diagonal with positive entries we can set $\tilde{A}=\sqrt{A}$ to get $$w^T(\tilde{A}X)^T\tilde{A}Xw \geq \sigma,$$ which holds for any $\sigma$ no larger than the *square* of the smallest singular value of $\tilde{A}X$ (equivalently, the smallest eigenvalue of $X^TAX$). One caveat: $A$ depends on $\theta$, and its diagonal entries tend to $0$ as $||\theta|| \to \infty$, so this $\sigma$ cannot be bounded away from zero uniformly over all of $\mathbb{R}^p$. The loss is therefore strongly convex on any bounded set of $\theta$, but only strictly convex globally; that is what is missing.
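This bound can be verified numerically: the smallest eigenvalue of $X^TAX$ coincides with the square of the smallest singular value of $\tilde{A}X$ (names and random data below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 4))
s = 1.0 / (1.0 + np.exp(-(X @ np.ones(4))))  # sigmoids at an arbitrary theta
A = s * (1.0 - s)                            # diagonal of A at that theta

H = (X * A[:, None]).T @ X                   # Hessian X^T A X
AX = np.sqrt(A)[:, None] * X                 # \tilde{A} X with \tilde{A} = sqrt(A)
sigma = np.linalg.svd(AX, compute_uv=False).min() ** 2

# sigma equals the smallest eigenvalue of the Hessian at this theta
print(np.isclose(np.linalg.eigvalsh(H).min(), sigma))
```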