A spectraplex (a special case of a spectrahedron) is the set of all positive semi-definite matrices whose trace equals one. Formally, let
$$ S = \left\{\textbf{W} \in \mathbb{R}^{d \times d} \mid \textbf{W} \succeq 0, \text{Tr}(\textbf{W})=1 \right\} $$
In Lemma 2 on page 7 of Dan Garber's On the regret minimization of nonconvex online gradient ascent for online PCA, it is stated that "it is well known that the projection of $\textbf{W}$ onto $S$ is given by" $$ \Pi_S[\textbf{W}]=(\lambda_1(\textbf{W})-\lambda)w'w'^T+ \sum_{i=2}^{d}\max\{0,\lambda_i(\textbf{W})-\lambda \}y_iy_i^T $$
where $(w',y_2,y_3,\cdots,y_d)$ are eigenvectors of $\textbf{W}$ corresponding to descending eigenvalues such that
$$ \lambda_1(\textbf{W})-\lambda+ \sum_{i=2}^{d}\max\{0,\lambda_i(\textbf{W})-\lambda \}=1 $$ for some $\lambda \geq 0$. Where does the above come from? Is there a reference that explains such projections?
First, note that no such $\lambda$ exists if
$\sum_{i=1}^{d} \max(0,\lambda_{i}) < 1$
(in particular, if $W \succeq 0$ and $\sum_{i=1}^{d} \lambda_{i} < 1$).
For example, the method fails if $W=0$ or if $W=(1/(2d))I$. In the context of the paper you cited, it appears we can be sure that the sum of the $\lambda_{i}$ is large enough.
I believe that if you allow for negative values of $\lambda$ the method will actually produce correct results for matrices $W$ that have $\sum_{i=1}^{d} \lambda_{i}<1$.
In the following, we'll assume that
$\sum_{i=1}^{d} \lambda_{i} \geq 1$.
Since $W$ is symmetric, we can write it in terms of eigenvalues and normalized eigenvectors as
$W=\sum_{i=1}^{d} \lambda_{i} y^{(i)}{y^{(i)}}^{T}$
or
$W=\sum_{i=1}^{d} \lambda_{i} Y^{(i)}$
where
$Y^{(i)}=y^{(i)}{y^{(i)}}^{T}$.
We need to be specific about the matrix inner product and norm that we're projecting with respect to. Since $W$ lives in the space of symmetric matrices, the appropriate inner product is
$\langle A, B \rangle = \mbox{tr}(AB)$
and the norm is
$\| A \|_{F}=\sqrt{\langle A, A \rangle}$.
Note that with respect to this inner product, the $Y^{(i)}$ matrices are orthogonal, because the eigenvectors $y^{(i)}$ are orthogonal: for $i \neq j$,
$\langle Y^{(i)}, Y^{(j)} \rangle=\mbox{tr}(y^{(i)}{y^{(i)}}^{T}y^{(j)}{y^{(j)}}^{T})$
$\langle Y^{(i)}, Y^{(j)} \rangle=\mbox{tr}\left(y^{(i)}\left({y^{(i)}}^{T}y^{(j)}\right){y^{(j)}}^{T}\right)=\mbox{tr}(y^{(i)} \cdot 0 \cdot {y^{(j)}}^{T})=0$.
Also,
$\| Y^{(i)} \|_{F}^{2}=\mbox{tr}(Y^{(i)}Y^{(i)})= \mbox{tr} (y^{(i)}{y^{(i)}}^{T} y^{(i)}{y^{(i)}}^{T}) =\mbox{tr}(y^{(i)}{y^{(i)}}^{T}) =\mbox{tr}({y^{(i)}}^{T}y^{(i)}) =1$.
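Both of these facts are easy to check numerically. Here is a minimal NumPy sketch (the example matrix is arbitrary, not taken from the paper):

```python
import numpy as np

# Arbitrary symmetric example matrix (not from the paper).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
W = (A + A.T) / 2.0

# Eigendecomposition: columns of V are the normalized eigenvectors y^(i).
lam, V = np.linalg.eigh(W)

# W = sum_i lambda_i Y^(i), where Y^(i) = y^(i) y^(i)^T.
Y = [np.outer(V[:, i], V[:, i]) for i in range(4)]
print(np.allclose(W, sum(lam[i] * Y[i] for i in range(4))))  # True

# The Y^(i) are orthonormal in the trace inner product <A, B> = tr(AB).
G = np.array([[np.trace(Y[i] @ Y[j]) for j in range(4)] for i in range(4)])
print(np.allclose(G, np.eye(4)))  # True
```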
The desired projection is
$\Pi_{S}(W)=\arg\min_{Y} \| W-Y \|_{F}^{2}$
subject to the constraints
$\mbox{tr}(Y)=1$
$Y \succeq 0$.
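As a sanity check, the same projection can be computed with a generic SDP solver; here is a minimal sketch assuming CVXPY is installed (the variable names are my own):

```python
import cvxpy as cp
import numpy as np

# Arbitrary symmetric example matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
W = (A + A.T) / 2.0

# min ||W - Y||_F  subject to  tr(Y) = 1,  Y PSD.
Y = cp.Variable((4, 4), PSD=True)
prob = cp.Problem(cp.Minimize(cp.norm(W - Y, "fro")), [cp.trace(Y) == 1])
prob.solve()
print(np.round(Y.value, 4))
```

This is useful for verifying the closed-form expression derived below on small examples.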
Write $Y$ as a linear combination of the $Y^{(i)}$ matrices plus an additional component that is in the orthogonal complement of the span of the $Y^{(i)}$.
$Y=Y^{(0)}+\sum_{i=1}^{d} \alpha_{i} Y^{(i)}$
Note that $W-Y=\left( W-\sum_{i=1}^{d} \alpha_{i} Y^{(i)} \right)-Y^{(0)}$, where the first term lies in the span of the $Y^{(i)}$ and $Y^{(0)}$ lies in its orthogonal complement. By the Pythagorean theorem,
$\| W-Y \|_{F}^{2}=\left\| W- \sum_{i=1}^{d} \alpha_{i} Y^{(i)} \right\|_{F}^{2}+\left\| Y^{(0)} \right\|_{F}^{2}$
$\| W-Y \|_{F}^{2}=\left\| \sum_{i=1}^{d} (\lambda_{i}-\alpha_{i}) Y^{(i)} \right\|_{F}^{2}+\left\| Y^{(0)} \right\|_{F}^{2}$.
Using the orthogonality and unit norm of the $Y^{(i)}$ matrices, we get
$\| W-Y \|_{F}^{2}=\sum_{i=1}^{d} (\lambda_{i}-\alpha_{i})^{2}+\left\| Y^{(0)} \right\|_{F}^{2}$.
Since $Y^{(0)}$ is orthogonal to every $Y^{(i)}$, its diagonal entries in the eigenbasis of $W$ are all zero, so $\mbox{tr}(Y^{(0)})=0$ and
$\mbox{tr}(Y)=\sum_{i=1}^{d} \alpha_{i}$.
Furthermore, if $Y \succeq 0$ then $\alpha_{i}={y^{(i)}}^{T}Yy^{(i)} \geq 0$, so dropping $Y^{(0)}$ keeps $Y$ feasible while removing the $\left\| Y^{(0)} \right\|_{F}^{2}$ term from the objective. Thus it's optimal to set $Y^{(0)}=0$.
We've now reduced the problem to projecting the vector of eigenvalues $(\lambda_{1},\ldots,\lambda_{d})$ onto the probability simplex:
$\min \sum_{i=1}^{d} (\lambda_{i}-\alpha_{i})^{2}$
subject to
$\sum_{i=1}^{d} \alpha_{i}=1$
$\alpha \geq 0$.
It's easy to see that the equation given in the paper results from applying the KKT conditions to this problem.
Let $\mu$ be the Lagrange multiplier for the constraint $\sum_{i=1}^{d}\alpha_{i}=1$ and $\nu_{i} \geq 0$ be the Lagrange multiplier for the constraint $\alpha_{i} \geq 0$. The $i$th stationarity condition is $-2(\lambda_{i}-\alpha_{i})+\mu-\nu_{i}=0$, and it resolves differently depending on whether $\alpha_{i}>0$, in which case complementary slackness gives $\nu_{i}=0$, or $\alpha_{i}=0$, in which case $\nu_{i}$ can be any nonnegative value.
In the case $\alpha_{i}>0$ and $\nu_{i}=0$, we have
$-2(\lambda_{i}-\alpha_{i})+\mu=0$
or
$(\lambda_{i}-\alpha_{i})=\mu/2$.
In the case where $\alpha_{i}=0$, the condition is satisfied by taking $\nu_{i}=\mu-2\lambda_{i}$; for $\nu_{i}$ to be nonnegative we need $\lambda_{i} \leq \mu/2$, so these indices contribute nothing to $\sum_{i}\max(0,\lambda_{i}-\mu/2)$. Next, we use the constraint
$\sum_{\alpha_{i}>0} \alpha_{i}=1$
to get
$\sum_{\alpha_{i}>0} (\lambda_{i}-\mu/2)=1$
Adding in the terms for $\alpha_{i}=0$, we get
$\sum_{i=1}^{d} \max(0,\lambda_{i}-\mu/2)=1$.
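For example (a made-up illustration, not from the paper), take $d=3$ and eigenvalues $(\lambda_{1},\lambda_{2},\lambda_{3})=(1.2,0.5,0.1)$. Guessing that all three $\alpha_{i}$ are positive gives $\mu/2=(1.8-1)/3\approx 0.267$, but then $\lambda_{3}-\mu/2<0$, so $\alpha_{3}$ must be $0$. With only the first two indices active, $\mu/2=(1.2+0.5-1)/2=0.35$, and indeed $\lambda_{3}-0.35<0$, so
$\sum_{i=1}^{3} \max(0,\lambda_{i}-\mu/2)=0.85+0.15+0=1$
as required.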
Once $\mu$ has been determined, let
$\alpha_{i}=\max(0,\lambda_{i}-\mu/2)$ for $i=1, 2, \ldots, d$.
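Putting the pieces together, here is a short NumPy sketch of the whole projection. The function names and the sorting-based search for the threshold $\mu/2$ are my own choices (the paper does not prescribe how to find it), but the steps follow the derivation above:

```python
import numpy as np

def project_to_simplex(lam):
    """Solve min sum_i (lam_i - alpha_i)^2 s.t. sum_i alpha_i = 1, alpha >= 0
    by finding the threshold mu/2 from the sorted values, as in the KKT argument."""
    s = np.sort(lam)[::-1]                       # eigenvalues in descending order
    cssv = np.cumsum(s) - 1.0
    rho = np.nonzero(s - cssv / np.arange(1, lam.size + 1) > 0)[0][-1]
    theta = cssv[rho] / (rho + 1.0)              # this is mu/2 in the notation above
    return np.maximum(lam - theta, 0.0)

def project_to_spectraplex(W):
    """Project a symmetric matrix W onto {Y : Y PSD, tr(Y) = 1} in the Frobenius norm."""
    lam, V = np.linalg.eigh(W)                   # eigenvalues and orthonormal eigenvectors
    alpha = project_to_simplex(lam)              # project the eigenvalues onto the simplex
    return (V * alpha) @ V.T                     # reassemble sum_i alpha_i y^(i) y^(i)^T

# Quick check on an arbitrary symmetric matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
W = (A + A.T) / 2.0
P = project_to_spectraplex(W)
print(np.isclose(np.trace(P), 1.0), np.all(np.linalg.eigvalsh(P) >= -1e-10))
```

Note that this sketch does not restrict the threshold to be nonnegative, so, in line with the remark above about allowing negative values of $\lambda$, it also handles matrices whose eigenvalues sum to less than one.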