I have tried to prove the following statement but after days of trying I couldn't!
Suppose $\mathbf{X} \in \mathbb{R} ^ {N\times d}, \mathbf{y} \in \mathbb{R}^N$ and $\alpha > 0$. Show that for an arbitrary norm, if the solution of following optimization problem is unique, it's at most $N$-sparse.
$$\theta^* = \textrm{argmin}_{\theta \in R^d} \; ||\theta||_1 \; s.t. \; ||X\theta - y|| \leq \alpha$$
I have tried to prove it by contradiction. Suppose the unique solution has more than $N$ nonzero elements. $X\theta$ is the sum of columns of $X$ where corresponding $\theta$ element is non-zero. We know that more than $N$ vectors of length $N$ are not linearly independent, but I couldn't do anything with it!