Let $X=\{X_n\}$ be a finite state Markov Chain with the state space $S = \{0,1,2,...,N\}$ such that all the states are transient. The following is the transition matrix.
$$ P = \left[\begin{matrix} p_{00} & p_{01} & ... & p_{0N} \\ p_{10} & p_{11} & ... & p_{1N} \\ \vdots & \vdots & ... & \vdots \\p_{N0} & p_{N1} & ... & p_{NN} \end{matrix} \right] $$
Let $s_{ij}$ be the expected number of time periods the MC is in state $j$, given that it starts in state $i$. Then in most of the text books they directly say by conditioning on the initial transition, we get:
$$ s_{ij} = \delta_{ij} +\sum_{k=0}^{N} p_{ik}s_{kj} $$
In this derivation, what I understand is that:
$$ \begin{eqnarray} s_{ij} & = & E[\text{time periods spent in state j} | X_0=i]\\ & = & \sum_{k=0}^{N} E[\text{time periods spent in state j} | X_1=k,X_0=i] P(X_1=k|X_0=i)\\ & = & \sum_{k=0}^{N} E[\text{time periods spent in state j} | X_1=k] P(X_1=k|X_0=i)\\ & = &\sum_{k=0}^{N} p_{ik}s_{kj} \end{eqnarray} $$ So, where my argument is wrong and how we are getting $\delta_{ij}$ in the formula of $s_{ij}$?
Some remarks: