I wanted to show that least squares is Lipschitz. Consider
$$ L(w) = \| Xw -y \|^2$$
It's $K$-Lipschitz if
$$ | L(b) - L(a) |^2 \leq K\| b- a \|^2 $$
(the usual Lipschitz condition with both sides squared; $L$ is scalar-valued, so the left side is an absolute value).
so I did:
$$ L(b) - L(a) = b^T X^T X b - a^T X^T X a - 2 y^T X b + 2 y^T X a$$
$$ = b^T X^T X b - a^T X^T X a - 2 y^T X (b-a) $$
Now I want to factor a $b-a$ out of $b^T X^T X b - a^T X^T X a$, but I don't know how. Any ideas?
When things are scalars, i.e. the 1D case, it's easy because we get a $b^2 - a^2$ term:
$$ |L(b) - L(a)| = | (xb-y)^2 - (xa-y)^2 |$$ $$ = |x^2b^2 - x^2 a^2 - 2xby + 2xay | = | x^2 (b+a)(b-a) - 2xy(b-a)|$$ $$ = | x^2 (b+a) - 2xy||b-a|$$
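As a sanity check, here is a quick script verifying the scalar factorization on random inputs (note that the cross term enters with a minus sign when you expand the squares):

```python
# Numeric check of the 1D identity
# (xb - y)^2 - (xa - y)^2 == (x^2 (b + a) - 2 x y) (b - a)
import random

random.seed(0)
for _ in range(1000):
    x, y, a, b = (random.uniform(-5, 5) for _ in range(4))
    lhs = (x * b - y) ** 2 - (x * a - y) ** 2
    rhs = (x * x * (b + a) - 2 * x * y) * (b - a)
    assert abs(lhs - rhs) < 1e-9, (lhs, rhs)
```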
which shows it's Lipschitz... but I had difficulty generalizing this to the regression problem... it's obvious to me that it's some property of linear algebra that I don't know, or don't know how to conjure up...
The claim isn't true: $L$ is not globally Lipschitz. For simplicity, consider the case where $X$ is the scalar $1$ and $y=0$. Then $(L(a)-L(b))^2=(a^2 - b^2 )^2=(a+b)^2(a-b)^2$, so for any $K$, whenever $(a+b)^2>K$ we have $(L(a)-L(b))^2>K(a-b)^2$.
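You can see this numerically (a quick sketch with $X=1$, $y=0$, and the pair $a=t+1$, $b=t$, so $a-b=1$):

```python
# With L(w) = w^2, the ratio (L(a) - L(b))^2 / (a - b)^2 = (a + b)^2
# grows without bound, so no single K can work.
ratios = []
for t in [1, 10, 100, 1000]:
    a, b = t + 1.0, float(t)
    ratios.append((a ** 2 - b ** 2) ** 2 / (a - b) ** 2)

# each ratio equals (a + b)^2 = (2t + 1)^2: 9, 441, 40401, ...
assert ratios == sorted(ratios) and ratios[-1] > ratios[0]
```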
However, if you impose that $\|a\|$ and $\|b\|$ are bounded then you're fine. Suppose we impose $\|a\|^2,\|b\|^2\leq M$ for some scalar $M$ (writing $M$ for the bound so it doesn't clash with the loss $L$). You can easily verify (just use the associativity of matrix multiplication and expand) that
$$ a^T X^T Xa - b^T X^T Xb = (a-b)^T X^T Xb +b^T X^T X(a-b) + (a-b)^T X^T X (a-b). $$
Applying this to
$$ L(a) - L(b) = a^T X^T X a - b^T X^T X b - 2 y^T X (a-b), $$
the triangle inequality and Cauchy-Schwarz give
$$ |L(a)-L(b)| \leq \left( 2\|b\|\,\|X^T X\| + \|a-b\|\,\|X^T X\| + 2\|X^T y\| \right)\|a-b\|, $$
where the norm is the operator norm when the argument is a matrix. Squaring and using $(u+v+w)^2 \leq 3(u^2+v^2+w^2)$:
$$ |L(a)-L(b)|^2 \leq 3\left( 4\|b\|^2\|X^T X\|^2 + \|a-b\|^2\|X^T X\|^2 + 4\|X^T y\|^2 \right)\|a-b\|^2. $$
Now use the boundedness: $\|b\|^2\leq M$ and $\|a-b\|^2 \leq 2\|a\|^2 + 2\|b\|^2 \leq 4M$, so
$$ |L(a)-L(b)|^2 \leq \left( 24M\|X^T X\|^2 + 12\|X^T y\|^2 \right)\|a-b\|^2, $$
and you can set $K = 24M\|X^T X\|^2 + 12\|X^T y\|^2$.
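Here is a quick numerical check of a bound of this form (a sketch using NumPy; I write $M$ for the radius bound, use the conservative constants $24$ and $12$ that fall out of the triangle inequality plus $(u+v+w)^2 \le 3(u^2+v^2+w^2)$, and `numpy.linalg.norm(..., 2)` for the operator norm):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, M = 20, 5, 4.0  # M bounds ||a||^2 and ||b||^2

X = rng.normal(size=(n, d))
y = rng.normal(size=n)
L = lambda w: np.linalg.norm(X @ w - y) ** 2  # least-squares loss

XtX_norm = np.linalg.norm(X.T @ X, 2)  # operator norm of X^T X
Xty_norm = np.linalg.norm(X.T @ y)
K = 24 * M * XtX_norm ** 2 + 12 * Xty_norm ** 2

for _ in range(1000):
    a, b = rng.normal(size=d), rng.normal(size=d)
    # rescale into the ball ||w||^2 <= M
    a *= min(1.0, np.sqrt(M) / np.linalg.norm(a))
    b *= min(1.0, np.sqrt(M) / np.linalg.norm(b))
    assert (L(a) - L(b)) ** 2 <= K * np.linalg.norm(a - b) ** 2
```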