Considering the Orthogonal Matching Pursuit algorithm to solve $$\min_x \|y-Dx\|_2^2\\ s.t:~ \|x\|_0<T$$ In which elements of $x$ will be chosen separately and based on the maximum decrease each can make to the residual at each time step.
Theoretically the solution is not necessary globally optimal. For example, using a brute force search there might be another combination of elements of $x$ which result in a better optimum point having the same $\|x\|_0$.
I like to know why the approximation is relevantly good while theoretically it is not the best solution?