Given matrix $A \in \mathbb{R}^{m \times n}$, vector $y \in \mathbb{R}^m$, and scalar $ \eta \geq 0$,
$$\begin{array}{ll} \underset{x \in \mathbb{R}^n}{\text{minimize}} & \|x\|_1\\ \text{subject to} & \|Ax - y\| \leq \eta\end{array}$$
Assume that the norm in the constraint is an arbitrary norm on $\mathbb{R}^m$, and that the minimizer is unique; denote it by $x^*$. I want to show that $x^*$ is $m$-sparse, that is,
$$|\text{supp}(x^*)| \leq m$$
My understanding is that the proof of Theorem 3.1 in Simon Foucart & Holger Rauhut's *A Mathematical Introduction to Compressive Sensing* should work. That theorem states this fact for basis pursuit, i.e., for the equality constraint $Ax = y$ instead of $\|Ax - y\| \leq \eta$.
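To convince myself numerically, I sketched the perturbation step that the proof relies on: if the support has more than $m$ entries, the active columns of $A$ are linearly dependent, and walking along a null-space direction leaves $Ax$ (and hence the constraint) unchanged while zeroing out an entry without increasing the $\ell_1$ norm. The function name `sparsify` and all details below are my own illustration, not from the book:

```python
import numpy as np

def sparsify(A, x, tol=1e-10):
    """Shrink the support of x to at most m entries without changing Ax
    and without increasing ||x||_1, by repeatedly perturbing x along a
    null-space direction of the active columns (since A @ v = 0, the
    constraint ||Ax - y|| <= eta is untouched, whatever the norm)."""
    x = np.asarray(x, dtype=float).copy()
    m = A.shape[0]
    while True:
        S = np.flatnonzero(np.abs(x) > tol)
        if len(S) <= m:
            return x
        # More than m active columns must be linearly dependent: the last
        # right-singular vector lies in the null space of A[:, S].
        _, _, Vt = np.linalg.svd(A[:, S])
        w = Vt[-1]
        # Flip w so the one-sided derivative of t -> ||x + t w||_1 at
        # t = 0, which equals sum_i sign(x_i) w_i on the support, is <= 0.
        if np.dot(np.sign(x[S]), w) > 0:
            w = -w
        # Step to the first point where an entry crosses zero; the l1 norm
        # is piecewise linear along the segment, so it cannot increase.
        mask = np.abs(w) > tol
        steps = -x[S][mask] / w[mask]
        t = steps[steps > 0].min()
        x[S] = x[S] + t * w
        x[np.abs(x) <= tol] = 0.0

# Demo on a random 3x8 system: the minimum-norm least-squares solution
# is dense, and sparsify brings its support down to at most m = 3.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))
x0, *_ = np.linalg.lstsq(A, rng.standard_normal(3), rcond=None)
xs = sparsify(A, x0)
```

Running this on the demo data, `xs` keeps the same product $Ax$ as `x0`, has $\ell_1$ norm no larger, and is supported on at most $3$ entries, mirroring the statement I want to prove.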
My questions are:

1. Does the proof referenced above indeed cover this case as well?
2. Are there any other interesting proofs of this fact which use convex optimization, or more general results from optimization?
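For what it's worth, here is my sketch of why the argument should extend, under the uniqueness assumption above. If $|\operatorname{supp}(x^*)| > m$, the columns $\{a_j : j \in \operatorname{supp}(x^*)\}$ of $A$ are linearly dependent, so there exists $v \neq 0$ with
$$\operatorname{supp}(v) \subseteq \operatorname{supp}(x^*), \qquad Av = 0.$$
Then $A(x^* + tv) = Ax^*$ for every $t$, so $x^* + tv$ remains feasible for $\|Ax - y\| \leq \eta$ regardless of which norm is used. For small $|t|$ the signs on the support are preserved, hence
$$\|x^* + tv\|_1 = \|x^*\|_1 + t \sum_{j} \operatorname{sign}(x_j^*)\, v_j,$$
and choosing the sign of $t$ appropriately makes this $\leq \|x^*\|_1$, producing a second minimizer and contradicting uniqueness. Nothing here seems to use the constraint beyond the fact that $Av = 0$ leaves it invariant, which is why I expect the basis-pursuit proof to carry over.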