About the closest linear function to an arbitrary function in L1 norm

235 Views Asked by At

Let $\mu$ be a probability distribution over $\mathbb{R}^n$. All functions discussed henceforth are from $\mathbb{R}^n$ to $\mathbb{R}$. Let $l^\ast$ be a linear function and $f$ be a function such $f=l^\ast$ on a set that contains strictly larger than 50% fraction of the probability mass. Show that $l^\ast$ is a minimizer of the $L_1(\mu)$ error $\|f-l\|_{L_1(\mu)}$ over all linear functions $l$.

I tried to prove this for specific $f$'s to get intuition. But, even for specific $f$'s, there is a lot of cases to handle. It didn't look nice at all, and I decided to give up.

I looked at things surrounding the area "robust linear regression" on the internet, but couldn't find an explanation for this problem.

I tried to find inspiration from the various proofs that median minimizes $\ell_1$ error of some finite set of reals. In this direction, I tried to find the point at which gradient of the objective function is equal to zero. Here too, I got to a point where there was a large number of cases to handle, and decided to give up.

3

There are 3 best solutions below

3
On BEST ANSWER

counter-example: In my notation $\mathcal L_{[a,b]}$ is the Lebesgue measure in $[a,b]$. Lets consider $\mu=\frac{\mathcal L_{[0,2]}+\mathcal L_{[3,4]}}{3}$ and define: $$f(x)=\begin{cases}x \;\text{ if } x\in[3,4] \\0 \;\;\text{ otherwise } \end{cases} $$ so in this situation we have that $l^*\equiv0$ (because $f_{|[0,2]}=0$ and $\mu([0,2])=\frac{2}{3})$. If $l(x)=x$ then: $$||f-l^*||=\frac{1}{3}\int_3^4 xdx=\frac{1}{3}\frac{7}{2}> $$ $$>||f-l||=\frac{1}{3}\int_0^2xdx =\frac{1}{3}2. $$

edit: There's a much simple counter-example, let $\mu=\mathcal L_{[0,2]}$ and $\varepsilon>0$. Be $f$ a function s.t.: $$f=\begin{cases}x\;\; x\in[1+\varepsilon,2]\\0 \;\;\text{otherwise}\end{cases}$$ hence if $l^*\equiv0,\;l(x)=x$ then: $$||f-l^*||=\int_{1+\varepsilon}^2 xdx=2-\frac{(1+\varepsilon)^2}{2}>$$ $$>||f-l||=\int_0^{1+\varepsilon}xdx=\frac{(1+\varepsilon)^2}{2}, $$ which is true for $\varepsilon$ small enough.

0
On

Disclaimer: this is NOT an answer but the track I have been following till I somehow got also stuck and would be interested in a possible conclusion from here. I allowed myself to post it as I think I made significant progress in the right direction (hopefully) so someone could finish the job. Please let me know if this is inapropriate and I will delete.

Let me rewrite the problem to set some notations. We want to show that

$$\arg \min_{l \text{ linear}} \|f - l\|_{L^1(\mu)} = l^*$$

where $l^*=f$ on $A \subset \mathbb{R}^n$ where $\mu(A) = \int_A d \mu >1/2$. We have:

$$\|f - l\|_{L^1(\mu)} = \int |f-l|d \mu = \int_A |f-l|d \mu+\int_{\bar{A}} |f-l|d \mu.$$

Let $l$ such that $\|f - l\|_{L^1(\mu)} \le \|f - l^*\|_{L^1(\mu)}$. Then, since $f=l^*$ on $A$:

$$\int_A |f-l|d \mu+\int_{\bar{A}} |f-l|d\mu \le \int_{\bar{A}} |f-l^*|d \mu$$

Rearranging and using the second triangle inequality yields:

$$\int_A |f-l|d \mu \le \int_{\bar{A}} |f-l^*|-|f-l|d \mu \le \int_{\bar{A}} ||f-l^*|-|f-l||d \mu \le \int_{\bar{A}} |l^*-l|d \mu$$

But, once more, $f=l^*$ on $A$, so we must have: $$\int_A |m|d \mu \le \int_{\bar{A}} |m|d \mu$$

where $m=l^*-l$, the linear map produced by the difference. Now I think we should be able to use the linearity of $m$ and the fact that $\mu(A)>1/2$ to conclude but failed to do so till now.

4
On

I found a counter-example while trying to extend nicomezi's argument.

Take $n = 1$, $\mu$ to be the uniform distribution over $[-1,1]$, $l^\ast = 0$ and $l = 1 + x$. Define $f$ to be equal to $l^\ast$ in $[-1,\epsilon]$ and equal to $l$ in $[\epsilon, 1]$ for some small enough $\epsilon$ > 0.

Now $\|f-l^\ast\|_{L_1(\mu)}$ is close to 3/2 while $\|f-l\|_{L_1(\mu)}$ is close to 1/2.