Given a design matrix $X$ such that $XX^{T} = I_{p}$, prove that the solution $w_{opt}^{BIC} = (w_{opt_{1}}^{BIC}, w_{opt_{2}}^{BIC}, \ldots, w_{opt_{p}}^{BIC})$ of the minimization problem $\min_{w} \|Y-X^{T}w\|^{2} + \lambda\|w\|_{0}$ is given by the hard threshold $w_{opt_{j}}^{BIC} = w'_{j}\, 1(|w'_{j}| \ge \sqrt{\lambda})$ for $1\le j\le p$, where $w' = XY$.
I tried to reduce the above minimization problem to a convex form and obtained the following.
For any set $S \subset \{1,\ldots,p\}$ and any $w \in \mathbb{R}^{p}$, let $w_{S} \in \mathbb{R}^{p}$ be the vector with components $w_{S,n} = w_{n}\, 1(n \in S)$. Then we have
$\min_{w} \|Y-X^{T}w\|^{2} + \lambda\|w\|_{0} = \min_{S \subset \{1,\ldots,p\}} \left[ \min_{w_{S}} \|Y-X^{T}w_{S}\|^{2} + \lambda|S| \right]$, where $|S|$ is the number of elements in the set $S$.
Also, the design matrix $X$ has orthonormal rows here, since $XX^{T}=I_{p}$. But I am unable to proceed from the above expression, as it involves a search over all subsets $S$.
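For what it is worth, the claim can at least be checked numerically for small $p$. The following is only a rough sketch, not a proof. It assumes NumPy, and the function names (`hard_threshold`, `objective`, `brute_force`) and the random test data are my own choices for illustration. It searches over every subset $S$, solves the least-squares problem restricted to $S$, and compares the best result with the hard threshold.

```python
import itertools
import numpy as np

def hard_threshold(X, Y, lam):
    """Candidate closed form: keep w'_j = (XY)_j only where |w'_j| >= sqrt(lam)."""
    w_prime = X @ Y
    return w_prime * (np.abs(w_prime) >= np.sqrt(lam))

def objective(X, Y, w, lam):
    """||Y - X^T w||^2 + lam * ||w||_0."""
    return np.sum((Y - X.T @ w) ** 2) + lam * np.count_nonzero(w)

def brute_force(X, Y, lam):
    """Minimize over every support S, solving least squares restricted to S."""
    p = X.shape[0]
    best_w = np.zeros(p)
    best_val = objective(X, Y, best_w, lam)  # S = empty set
    for k in range(1, p + 1):
        for S in itertools.combinations(range(p), k):
            idx = list(S)
            w = np.zeros(p)
            # Inner (convex) problem: least squares using only the rows of X in S.
            w[idx] = np.linalg.lstsq(X[idx].T, Y, rcond=None)[0]
            val = objective(X, Y, w, lam)
            if val < best_val:
                best_w, best_val = w, val
    return best_w, best_val

# Assumed test setup: small random problem with orthonormal rows.
rng = np.random.default_rng(0)
p, n, lam = 4, 6, 0.5
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))  # Q has orthonormal columns
X = Q.T                                           # shape (p, n), so X X^T = I_p
Y = rng.standard_normal(n)

w_ht = hard_threshold(X, Y, lam)
w_bf, val_bf = brute_force(X, Y, lam)
print("hard threshold:", w_ht)
print("brute force:   ", w_bf)
print("objective gap: ", objective(X, Y, w_ht, lam) - val_bf)
```

The brute-force search visits all $2^{p}$ supports, so it is only practical for very small $p$.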
Any hints on how to approach this?