strong convexity of loss function in multi-dimensional (high-dimensional) space


My question is based on this paper (see the last ten lines on page 7), which seems to make a general claim:

In machine learning or statistics, the loss $l(W^TX, y)$ of a linear predictor can never be strongly convex as a function of $W$ in a multi-dimensional space, even if $l(\cdot)$ is strongly convex in its first argument, since it is flat in directions orthogonal to $X$.

How should I interpret this claim?
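For intuition, suppose $l$ is twice differentiable in its first argument (an assumption I am adding for concreteness; it is not required by the claim itself). For a single sample $X \in \mathbb{R}^d$, the Hessian of the composite loss with respect to $W$ is

$$\nabla_W^2 \, l(W^TX, y) = l''(W^TX, y)\, X X^T,$$

which has rank at most $1$. Strong convexity would require $\Delta W^T \nabla_W^2\, l \,\Delta W \ge \mu \|\Delta W\|^2$ for some $\mu > 0$ and all $\Delta W$, but any $\Delta W$ orthogonal to $X$ gives $\Delta W^T (X X^T) \Delta W = (X^T \Delta W)^2 = 0$. Such a $\Delta W \neq 0$ exists whenever $d > 1$.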


1 Answer


In high dimensions, $X^T$ is rank deficient and has a non-trivial kernel, so there exists $\Delta W \in \operatorname{null}(X^T) \setminus \{0\}$. Along such a direction the predictor does not change, $(W + \Delta W)^T X = W^T X$, so the loss is constant and therefore flat there, ruling out strong convexity. Plugging this $\Delta W$ into Michael's illustration completes the argument.
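A quick numerical sketch of the argument, using the squared loss $l(z, y) = (z - y)^2$ (my choice of example; it is $2$-strongly convex in $z$) and a direction $\Delta W$ projected to be orthogonal to $X$:

```python
import numpy as np

# Squared loss l(z, y) = (z - y)^2 is strongly convex in z,
# but f(W) = l(W @ X, y) is flat along directions orthogonal to X.
rng = np.random.default_rng(0)
X = rng.standard_normal(3)          # a single sample in R^3
y = 1.0
f = lambda W: (W @ X - y) ** 2

W = rng.standard_normal(3)

# Build a nonzero direction in null(X^T) by projecting out the X-component.
v = rng.standard_normal(3)
dW = v - (v @ X) / (X @ X) * X
assert abs(dW @ X) < 1e-10          # dW is orthogonal to X

# f is constant along dW, so it cannot be strongly convex in W.
for t in [0.1, 1.0, 10.0]:
    assert abs(f(W + t * dW) - f(W)) < 1e-8

# Equivalently: the Hessian 2 * X X^T is rank one, so it has
# zero eigenvalues in the d-1 directions orthogonal to X.
H = 2.0 * np.outer(X, X)
print(np.linalg.eigvalsh(H))
```

The printed spectrum has two (near-)zero eigenvalues and one positive eigenvalue, matching the rank-one Hessian in the claim.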