My notes on Markov chain long-term behaviour present the following definition:
Definition 10.
For any state $j$, define $N_j$ to be the number of times the Markov chain is at state $j$, excluding the initial state. Formally, $N_j = \# \{ n \ge 1 : X_n = j \}$.
Note that $N_j$ takes values in $\mathbb{N} \cup \{ 0 \} \cup \{ \infty \}$.
Note also that, for $m \ge 1$,
$$P(N_j = m \vert X_0 = i) = f_{ij} f_{jj}^{m - 1}(1 - f_{jj})$$
and
$$P(N_j \ge m \vert X_0 = i) = f_{ij} f_{jj}^{m - 1}.$$
It follows that
$$P(N_j = \infty \vert X_0 = i) = \begin{cases} 0 & \text{if} \ f_{jj} < 1, \\ f_{ij} & \text{if} \ f_{jj} = 1. \end{cases}$$
So $P(N_j = \infty \vert X_0 = i) = 0$ if $j$ is transient, and $f_{ij}$ if $j$ is recurrent.
I have spent a significant amount of time reading over these notes (see below) to try to gain a good mathematical and intuitive understanding of the equations $P(N_j = m \vert X_0 = i) = f_{ij} f_{jj}^{m - 1}(1 - f_{jj})$, $P(N_j \ge m \vert X_0 = i) = f_{ij} f_{jj}^{m - 1}$, and $P(N_j = \infty \vert X_0 = i) = \begin{cases} 0 & \text{if} \ f_{jj} < 1, \\ f_{ij} & \text{if} \ f_{jj} = 1, \end{cases}$ but I am having great difficulty doing so. The way these concepts are explained in my notes makes it very difficult for me to develop a clear understanding of them. For the first two equations in particular, I want to understand where the factors in $f_{ij} f_{jj}^{m - 1}(1 - f_{jj})$ and $f_{ij} f_{jj}^{m - 1}$ come from, as explained/introduced in the notes (see below).
The full notes are as follows:
Definition 7.
For a non-empty subset $A$ of the state space, define
$$T_A = \min\{ n \ge 1 : X_n \in A \}.$$
$T_A$ is called the hitting time or the time of first passage.
If $X_n \not\in A$ for all $n \ge 1$, then $T_A = \infty$.
For a state $i$, $T_{\{i\}}$ is written as $T_i$.
Theorem 10.
$$p^{(n)}_{ij} = \sum_{m = 1}^n P(T_j = m \vert X_0 = i) p^{(n - m)}_{jj}.$$
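Theorem 10 can be sanity-checked numerically on a toy example. The transition matrix below is hypothetical (not from the notes), and the recursion $f^{(n)}_{ij} = \sum_{k \ne j} p_{ik} f^{(n-1)}_{kj}$ (with $f^{(1)}_{ij} = p_{ij}$) is just one standard way to compute the first-passage probabilities $P(T_j = m \vert X_0 = i)$:

```python
import numpy as np

# A hypothetical 3-state chain (rows sum to 1); not from the notes.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

def first_passage(P, j, nmax):
    """f[n, i] = P(T_j = n | X_0 = i), via the recursion
    f^(1)_ij = p_ij and f^(n)_ij = sum_{k != j} p_ik f^(n-1)_kj."""
    S = P.shape[0]
    f = np.zeros((nmax + 1, S))
    f[1] = P[:, j]
    for n in range(2, nmax + 1):
        for i in range(S):
            f[n, i] = sum(P[i, k] * f[n - 1, k] for k in range(S) if k != j)
    return f

# Theorem 10: p^(n)_ij = sum_{m=1}^n P(T_j = m | X_0 = i) * p^(n-m)_jj
i, j, n = 0, 2, 8
f = first_passage(P, j, n)
powers = [np.linalg.matrix_power(P, k) for k in range(n + 1)]
lhs = powers[n][i, j]
rhs = sum(f[m, i] * powers[n - m][j, j] for m in range(1, n + 1))
assert abs(lhs - rhs) < 1e-12
```

The identity holds exactly (up to floating-point error) because it simply decomposes the event $\{X_n = j\}$ according to the time $m$ of the first visit to $j$.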
Definition 8.
For states $i$ and $j$, define
$$f^{(n)}_{ij} = P(X_n = j, X_k \not= j, k = 1, 2, \dots, n - 1 \vert X_0 = i)$$
and $f_{ij} = \sum_{n = 1}^\infty f^{(n)}_{ij}$.
In other words, $f^{(n)}_{ij}$ is the probability that, starting from state $i$, the first visit to $j$ occurs at the $n$th transition.
Note that $f^{(1)}_{ij} = p_{ij}$, and by convention $f^{(0)}_{ij} = 0$, even in the case $i = j$.
$f_{ij}$ is the same as $P(T_j < \infty \vert X_0 = i)$: it is the probability that the Markov chain, starting from state $i$, ever reaches state $j$.
Note that $P(T_j = \infty \vert X_0 = i)$ is $1 - f_{ij}$. In addition, $i$ leads to $j$ if and only if $f_{ij} > 0$.
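As a concrete instance of Definition 8, consider a hypothetical chain (not one from the notes) in which, from state $0$, the chain stays at $0$ with probability $0.3$, jumps to absorbing state $1$ with probability $0.4$, and jumps to absorbing state $2$ with probability $0.3$. The first visit to $1$ at step $n$ requires staying at $0$ for the first $n - 1$ steps:

```python
# f^(n)_{01} = 0.3**(n - 1) * 0.4, so summing the geometric series:
#   f_{01} = 0.4 / (1 - 0.3) = 4/7 < 1,
# and P(T_1 = infinity | X_0 = 0) = 1 - f_{01} = 3/7, which is the
# probability that the chain is absorbed at state 2 instead.
f_01 = sum(0.3 ** (n - 1) * 0.4 for n in range(1, 200))
assert abs(f_01 - 4 / 7) < 1e-12
```

This illustrates how $f_{ij}$ can be strictly less than $1$: the chain may get trapped somewhere it can never reach $j$ from.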
Theorem 11.
$$p^{(n)}_{ii} = \sum_{k = 0}^n f_{ii}^{(k)}p^{(n - k)}_{ii}$$
Definition 9. A state $i$ is said to be recurrent if $f_{ii} = 1$, and transient if $f_{ii} < 1$.
If the state $i$ is recurrent, the probability that the Markov chain starting from $i$ returns to $i$ is $1$.
If $i$ is transient, there is a positive probability that the chain, starting from $i$, will never return to $i$.
All absorbing states are recurrent, because if $p_{ii} = 1$, then $f_{ii} = 1$.
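A small worked example of Definition 9, on a hypothetical chain (my own, not from the notes) with transition matrix $\begin{pmatrix} 0.3 & 0.4 & 0.3 \\ 0.2 & 0.5 & 0.3 \\ 0 & 0 & 1 \end{pmatrix}$, where state $2$ is absorbing. First-step analysis gives the return probability $f_{00}$:

```python
# First-step analysis with g_1 = P(reach state 0 | X_0 = 1):
#   g_1  = p_10 + p_11 * g_1   =>  g_1  = 0.2 / (1 - 0.5) = 0.4
#   f_00 = p_00 + p_01 * g_1   =>  f_00 = 0.3 + 0.4 * 0.4  = 0.46
g_1 = 0.2 / (1 - 0.5)
f_00 = 0.3 + 0.4 * g_1
assert abs(f_00 - 0.46) < 1e-12
assert f_00 < 1   # state 0 is transient (Definition 9)
# State 2 is absorbing: p_22 = 1 forces f_22 = 1, so it is recurrent.
```

So states $0$ (and by a symmetric computation, $1$) are transient, while the absorbing state $2$ is recurrent, exactly as the note above predicts.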
Definition 10.
For any state $j$, define $N_j$ to be the number of times the Markov chain is at state $j$, excluding the initial state. Formally, $N_j = \# \{ n \ge 1 : X_n = j \}$.
Note that $N_j$ takes values in $\mathbb{N} \cup \{ 0 \} \cup \{ \infty \}$.
Note also that, for $m \ge 1$,
$$P(N_j = m \vert X_0 = i) = f_{ij} f_{jj}^{m - 1}(1 - f_{jj})$$
and
$$P(N_j \ge m \vert X_0 = i) = f_{ij} f_{jj}^{m - 1}.$$
It follows that
$$P(N_j = \infty \vert X_0 = i) = \begin{cases} 0 & \text{if} \ f_{jj} < 1, \\ f_{ij} & \text{if} \ f_{jj} = 1. \end{cases}$$
So $P(N_j = \infty \vert X_0 = i) = 0$ if $j$ is transient, and $f_{ij}$ if $j$ is recurrent.
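A sketch of how the first two formulas can be checked by simulation, on a hypothetical chain (the transition matrix and Monte Carlo parameters below are my own, not from the notes); state $2$ is absorbing, so $j = 1$ is transient and $N_1$ is finite on every run:

```python
import numpy as np

# Hypothetical 3-state chain; state 2 is absorbing.
P = np.array([[0.3, 0.4, 0.3],
              [0.2, 0.5, 0.3],
              [0.0, 0.0, 1.0]])
i, j = 0, 1

# First-step analysis for this chain:
#   f_01 = p_01 + p_00 * f_01  =>  f_01 = 0.4 / (1 - 0.3) = 4/7
#   f_11 = p_11 + p_10 * f_01  =>  f_11 = 0.5 + 0.2 * 4/7
f_ij = 4 / 7
f_jj = 0.5 + 0.2 * f_ij

# Monte Carlo estimate of P(N_j = m | X_0 = i).
rng = np.random.default_rng(0)
runs, maxm = 50_000, 10
counts = np.zeros(maxm)
for _ in range(runs):
    x, visits = i, 0
    while x != 2:                      # run until absorption
        x = rng.choice(3, p=P[x])
        if x == j:
            visits += 1
    if visits < maxm:
        counts[visits] += 1
est = counts / runs

# Compare with P(N_j = 0) = 1 - f_ij and the geometric formula.
assert abs(est[0] - (1 - f_ij)) < 0.015
for m in range(1, 5):
    assert abs(est[m] - f_ij * f_jj ** (m - 1) * (1 - f_jj)) < 0.015
```

The empirical distribution of $N_j$ matches $f_{ij} f_{jj}^{m-1}(1 - f_{jj})$ to within Monte Carlo error, which is exactly the "reach once, return $m - 1$ times, then never return" structure of the formula.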
I was wondering if people would please take the time to explain these three equations, both mathematically and intuitively, so that the ideas conveyed by them are clear to me.
Starting from state $i$, the chain visits state $j$ exactly $m \ge 1$ times precisely when it first reaches $j$ (probability $f_{ij}$), then, by the Markov property, returns to $j$ another $m - 1$ times (probability $f_{jj}$ for each return), and finally never returns to $j$ again (probability $1 - f_{jj}$). Multiplying them together,
\begin{align} P(N_j=m|X_0=i) = f_{ij}f_{jj}^{m-1}(1-f_{jj}). \end{align}
The argument is the same for the tail probability, except that the final "never return again" factor is dropped: the chain must reach $j$ once and then return at least $m - 1$ more times, that is
\begin{align} P(N_j\ge m|X_0=i) = f_{ij}f_{jj}^{m-1}. \end{align}
Now for $P(N_j = \infty \vert X_0 = i)$: starting from state $i$, we first need to reach state $j$, and then visit state $j$ infinitely many times. However, if $f_{jj} < 1$, then $j$ is a transient state, so after some finite time we will never visit state $j$ again, and hence we cannot have $N_j = \infty$.
A more mathematical claim is by taking the limit of the previous term:
$$\lim_{m \to \infty}P(N_j\ge m|X_0=i)=\lim_{m \to \infty}f_{ij}f_{jj}^{m-1}=f_{ij}\lim_{m \to \infty}f_{jj}^{m-1}=0$$
since $0 \le f_{jj} < 1$ implies $f_{jj}^{m-1} \to 0$ as $m \to \infty$.
If instead $f_{jj} = 1$: we begin from state $i$ and need to visit state $j$; once we visit it, since state $j$ is recurrent, it will be visited infinitely often. Accordingly,
$$\lim_{m \to \infty}P(N_j\ge m|X_0=i)=\lim_{m \to \infty}f_{ij}f_{jj}^{m-1}=f_{ij}\lim_{m \to \infty}1^{m-1}=f_{ij}$$
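The dichotomy between the two limits can be illustrated numerically with a hypothetical value $f_{ij} = 0.6$:

```python
# Tail probability f_ij * f_jj**(m - 1) for a large m: it vanishes
# when f_jj < 1 and stays equal to f_ij when f_jj = 1, matching the
# two limits above (f_ij = 0.6 is an arbitrary illustrative value).
f_ij = 0.6
assert f_ij * 0.9 ** 999 < 1e-40   # f_jj = 0.9 < 1: P(N_j = inf) = 0
assert f_ij * 1.0 ** 999 == f_ij   # f_jj = 1:      P(N_j = inf) = f_ij
```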