I would like to estimate a matrix $S$ by solving the following optimization problem
\begin{align} &\min\limits_{S} f(S) \\ &\text{subject to }\sum_{i,j}s_{i,j}=1,\quad s_{i,j}\geq0~\forall(i,j)\end{align}
where the function $f$ is convex, and the entries of the matrix $S$ are constrained to be non-negative and to sum to $1$.
I am solving the problem by projected gradient descent, using the following updates at each iteration $k+1$:
\begin{align} &\tilde{s}_{i,j}^{(k+1)}= \max\Big(0,\;s_{i,j}^{(k)} - \eta\big[\nabla f\big(S^{(k)}\big)\big]_{i,j}\Big)\\ &s^{(k+1)}_{i,j} = \frac{\tilde{s}^{(k+1)}_{i,j}}{\sum_{i,j}\tilde{s}^{(k+1)}_{i,j}}\end{align}
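Concretely, this is what I am running (a NumPy sketch; the quadratic $f$ and the target matrix `A` are placeholders I chose just to make it self-contained):

```python
import numpy as np

# Toy convex objective f(S) = 0.5 * ||S - A||^2, so grad f(S) = S - A.
# A is a placeholder target matrix, chosen only to make the sketch runnable.
A = np.array([[0.6, 0.1],
              [0.2, 0.1]])

S = np.full((2, 2), 0.25)   # feasible starting point
eta = 0.5                   # step size

for _ in range(100):
    S = np.maximum(0.0, S - eta * (S - A))  # gradient step, then clip at 0
    S = S / S.sum()                         # then rescale so entries sum to 1
```

(For this particular `A` the clipping never activates, but in general all entries can be clipped to $0$, making the denominator vanish.)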
Apparently what I am doing is not correct, so I am wondering how I should write the projection step properly.
EDIT:
My update may lead to dividing the entries $s_{i,j}^{(k+1)}$ by $0$. I thought about zeroing out the negative entries after normalizing the $s_{i,j}^{(k+1)}$'s instead, but if I do so I no longer have $\sum_{i,j}s_{i,j}=1$.

Unfortunately what you're trying to do is not correct as you can see in the following figure:
You are taking the projection of the gradient step onto $\{x: x\geq 0\}$ and then a second projection onto $E=\{x:\sum_i x_i = 1\}$; that is, you end up with $P_E(P_+(x))$, which is different from the projection onto the intersection, $P_{E\cap +}(x)$. I am not sure whether what you are using will converge to an optimal point.
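A small numeric check of this (my own 2-D example; in 2-D the simplex is just the segment from $(1,0)$ to $(0,1)$, so the true projection can be found by brute force):

```python
import numpy as np

x = np.array([1.5, -0.5])

# P_+ : clip to the nonnegative orthant
p_plus = np.maximum(x, 0.0)                        # -> [1.5, 0.0]
# P_E : orthogonal projection onto E = {x : sum(x) = 1}
composed = p_plus - (p_plus.sum() - 1.0) / x.size  # P_E(P_+(x))

# True projection onto the simplex: brute force over the segment
# {(t, 1 - t) : t in [0, 1]}.
ts = np.linspace(0.0, 1.0, 100001)
pts = np.column_stack([ts, 1.0 - ts])
exact = pts[np.argmin(((pts - x) ** 2).sum(axis=1))]

print(composed)  # [ 1.25 -0.25] -- infeasible: it has a negative entry
print(exact)     # [1. 0.]
```

So the composition is not even feasible here, let alone equal to the true projection.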
Judging from the above figure, it might be best to first project onto $E$ and then onto the nonnegative orthant (that is, $P_{E\cap +}(\cdot) = P_+(P_E(\cdot))$), but sketches are often deceptive, so you would need to prove it. For sure, what you are currently using does not correspond to the projected gradient method.
There are several things you can do, however. You can dualise either the constraint $x\geq 0$ or the constraint $\sum_i x_i = 1$, use a three-term splitting method, or use ADMM.
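For completeness: since the feasible set is the probability simplex, the exact Euclidean projection $P_{E\cap +}$ can also be computed directly by the well-known sort-based algorithm (Held et al.; Duchi et al.), so you can run a genuine projected gradient method. A sketch, again with a placeholder quadratic $f$ and a target matrix `A` of my own choosing:

```python
import numpy as np

def project_simplex(v):
    """Exact Euclidean projection of a vector v onto {x : sum(x) = 1, x >= 0}
    via the sort-based algorithm."""
    u = np.sort(v)[::-1]              # entries in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u * k > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

# Projected gradient descent on the toy objective f(S) = 0.5 * ||S - A||^2,
# where A is an arbitrary (infeasible) placeholder target.
A = np.array([[1.0, -0.3],
              [0.4,  0.2]])
S = np.full((2, 2), 0.25)
eta = 0.5
for _ in range(200):
    G = S - A                                   # gradient of f at S
    S = project_simplex((S - eta * G).ravel()).reshape(S.shape)

# S is now feasible and minimizes f over the simplex:
# approximately [[0.8, 0.0], [0.2, 0.0]] for this A.
```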