Let $\mu$ be a probability distribution over $\mathbb{R}^n$. All functions discussed henceforth are from $\mathbb{R}^n$ to $\mathbb{R}$. Let $l^\ast$ be a linear function and let $f$ be a function such that $f=l^\ast$ on a set of probability mass strictly greater than $1/2$. Show that $l^\ast$ is a minimizer of the $L_1(\mu)$ error $\|f-l\|_{L_1(\mu)}$ over all linear functions $l$.
I tried to prove this for specific $f$'s to get intuition. But even for specific $f$'s, there were many cases to handle. It didn't look clean at all, and I decided to give up.
I searched online around the topic of "robust linear regression", but couldn't find an explanation for this problem.
I tried to find inspiration in the various proofs that the median minimizes the $\ell_1$ error of a finite set of reals. In this direction, I tried to find the point at which the gradient of the objective function vanishes. Here too, I reached a large number of cases to handle, and decided to give up.
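For what it's worth, the median fact I was drawing on is easy to check numerically. Here is a throwaway sketch (the data and the grid of candidates are arbitrary choices of mine):

```python
# L1 cost of a candidate center c for a fixed finite data set
data = [1.0, 2.0, 3.0, 10.0, 100.0]

def l1_cost(c):
    return sum(abs(x - c) for x in data)

# for odd-length data, the middle order statistic is a median
median = sorted(data)[len(data) // 2]

# brute-force check over a grid of candidate centers:
# no candidate does better than the median
candidates = [k / 10 for k in range(-50, 1200)]
assert all(l1_cost(median) <= l1_cost(c) + 1e-12 for c in candidates)
```

The pointwise argument behind this (shifting $c$ toward the median can only decrease the cost while more than half the points sit on the other side) is exactly what fails to transfer to linear functions, as the counter-examples below show.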
Counter-example: in my notation, $\mathcal L_{[a,b]}$ is the Lebesgue measure on $[a,b]$. Consider $\mu=\frac{\mathcal L_{[0,2]}+\mathcal L_{[3,4]}}{3}$ and define $$f(x)=\begin{cases}x &\text{ if } x\in[3,4], \\ 0 &\text{ otherwise.}\end{cases}$$ In this situation $l^*\equiv0$ (because $f_{|[0,2]}=0$ and $\mu([0,2])=\frac{2}{3}>\frac12$). But if $l(x)=x$, then $$\|f-l^*\|_{L_1(\mu)}=\frac{1}{3}\int_3^4 x\,dx=\frac{7}{6} \;>\; \frac{2}{3}=\frac{1}{3}\int_0^2 x\,dx=\|f-l\|_{L_1(\mu)},$$ so $l^*$ is not a minimizer.
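One can sanity-check these two integrals numerically. Below is a quick sketch using a hand-rolled midpoint rule (all names are mine; the midpoint rule is exact here since each integrand is linear on each interval):

```python
def integrate(g, a, b, n=20000):
    # midpoint-rule approximation of the integral of g over [a, b]
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

def f(x):
    # f(x) = x on [3, 4], 0 elsewhere
    return x if 3 <= x <= 4 else 0.0

def l1_error(l):
    # ||f - l||_{L_1(mu)} with mu = (Leb[0,2] + Leb[3,4]) / 3
    return (integrate(lambda x: abs(f(x) - l(x)), 0, 2)
            + integrate(lambda x: abs(f(x) - l(x)), 3, 4)) / 3

err_lstar = l1_error(lambda x: 0.0)  # l* = 0, agrees with f on mass 2/3
err_l = l1_error(lambda x: x)        # the competitor l(x) = x
print(err_lstar, err_l)              # ~1.1667 vs ~0.6667, so l beats l*
```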
Edit: there's a much simpler counter-example. Let $\mu=\frac{\mathcal L_{[0,2]}}{2}$ (normalized so that it is a probability measure) and fix $\varepsilon>0$. Let $f$ be the function $$f(x)=\begin{cases}x &\text{ if } x\in[1+\varepsilon,2],\\ 0 &\text{ otherwise,}\end{cases}$$ so $f$ agrees with $l^*\equiv0$ on a set of mass $\frac{1+\varepsilon}{2}>\frac12$. With $l(x)=x$ we get $$\|f-l^*\|_{L_1(\mu)}=\frac12\int_{1+\varepsilon}^2 x\,dx=1-\frac{(1+\varepsilon)^2}{4} \;>\; \frac{(1+\varepsilon)^2}{4}=\frac12\int_0^{1+\varepsilon}x\,dx=\|f-l\|_{L_1(\mu)},$$ which holds whenever $(1+\varepsilon)^2<2$, i.e. for every $\varepsilon<\sqrt{2}-1$.
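This one can be checked numerically as well, say at $\varepsilon = 0.1$ (a sketch with a midpoint rule; the integrand has a jump at $1+\varepsilon$, so a fine grid is used and the check is only up to a small tolerance):

```python
def integrate(g, a, b, n=100000):
    # midpoint-rule approximation of the integral of g over [a, b]
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

eps = 0.1  # any eps < sqrt(2) - 1 ~ 0.414 works

def f(x):
    # f(x) = x on [1 + eps, 2], 0 elsewhere
    return x if 1 + eps <= x <= 2 else 0.0

# mu is Lebesgue measure on [0, 2] divided by 2, hence the factor 1/2
err_lstar = 0.5 * integrate(lambda x: abs(f(x) - 0.0), 0, 2)  # vs l* = 0
err_l = 0.5 * integrate(lambda x: abs(f(x) - x), 0, 2)        # vs l(x) = x
print(err_lstar, err_l)  # ~0.6975 vs ~0.3025, so again l beats l*
```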