I have trouble understanding the notation in exercise 7.11.31:
I don't understand what $\lambda(\textbf{P})$ is. Let's take the factor $f_{X_0}$. My current understanding is as follows. We know that $\mathbb{P}(X_n=i)$ and $\mathbb{P}(X_n(\omega)=i)$ are shorthand for $\mathbb{P}(\{\omega\mid X_n(\omega)=i\})$. Based on this I introduce another shorthand notation $\mathbb{P}_\omega(X_n(\omega) = i) = \mathbb{P}(X_n = i)$. Using this notation I interpret $$ f_{X_0} = f_{X_0}(\beta) = \mathbb{P}_{\alpha}(X_0(\alpha)=X_0(\beta)), $$ which itself is a random variable having the probability mass function $$ \mathbb{P}_\beta(\mathbb{P}_{\alpha}(X_0(\alpha)=X_0(\beta)) = p). $$
Similarly, if I pick the factor $p_{X_0,X_1}$, then I interpret $$ p_{X_0,X_1} = p_{X_0,X_1}(\beta) = \mathbb{P}_\alpha(X_{1}(\alpha) = X_1(\beta) \mid X_0(\alpha) = X_0(\beta)), $$ which is a random variable having probability mass function $$ \mathbb{P}_\beta(\mathbb{P}_\alpha(X_{1}(\alpha) = X_1(\beta) \mid X_0(\alpha) = X_0(\beta)) = p). $$ If this is correct, how do I interpret $\lambda(\mathbf{P})$ in terms of these random variables and their respective probability mass functions?

You should interpret an expression such as $$ \log (f_{X_0}p_{X_0,X_1}p_{X_1,X_2})\tag1 $$ as the function $$ h(i, j, k):= \log(f_ip_{i,j}p_{j,k})\tag2 $$ evaluated at $i=X_0, j=X_1, k=X_2$. In other words, take the function $h$ defined in (2), which is a function of three integer inputs that returns a real-valued output, and plug in $X_0, X_1, X_2$ in place of those inputs. The result is $h(X_0, X_1, X_2)$, a random variable, which we write in the form (1).
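To make the distinction between (1) and (2) concrete, here is a minimal sketch in Python. The initial distribution $f$ and transition matrix $\bf P$ below are made up for illustration (a hypothetical 2-state chain, not from the exercise); the point is only that $h$ is a plain deterministic function of three integers, and the random variable (1) arises by plugging in $X_0, X_1, X_2$.

```python
import numpy as np

# Hypothetical initial distribution f and transition matrix P for a
# 2-state chain with states 0 and 1 (illustrative values only).
f = np.array([0.6, 0.4])
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])

def h(i, j, k):
    """The deterministic function in (2): three integer inputs, one real output."""
    return np.log(f[i] * P[i, j] * P[j, k])

# Plugging in a realized path (x0, x1, x2) = (0, 1, 1) gives a number;
# plugging in the random variables X0, X1, X2 would give the random variable (1).
value = h(0, 1, 1)
```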
Analogously, the log-likelihood function $\lambda({\bf P})$ that involves $X_0, X_1,\ldots,X_n$ is obtained by plugging $X_0, X_1,\ldots , X_n$ into a function that takes $n+1$ integer inputs.
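As a sketch of that $(n+1)$-input function, the deterministic version of the log-likelihood is $\log f_{x_0} + \sum_{t=1}^n \log p_{x_{t-1},x_t}$. The `f` and `P` inputs below are hypothetical placeholders, not values given in the exercise:

```python
import numpy as np

def log_likelihood(P, f, path):
    """Deterministic log-likelihood: log f_{x0} + sum_t log p_{x_{t-1}, x_t}.
    P is a transition matrix, f an initial distribution, path a list of
    observed integer states x_0, ..., x_n (all hypothetical inputs here)."""
    ll = np.log(f[path[0]])
    for a, b in zip(path[:-1], path[1:]):
        ll += np.log(P[a, b])
    return ll

# Example with illustrative values for a 2-state chain.
f = np.array([0.6, 0.4])
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
ll = log_likelihood(P, f, [0, 1, 1])
```

Replacing the fixed `path` by the random variables $X_0,\ldots,X_n$ turns this number into the random variable $\lambda(\bf P)$.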
I'm not sure where the authors are headed with this, but if the goal is to estimate the matrix $\bf P$ of transition probabilities, the typical setup is to observe the Markov chain for a while, obtaining observed values $x_0, x_1, \ldots, x_n$. These are plugged into the log-likelihood, then the log-likelihood is manipulated to obtain plausible estimates of the transition probabilities. For the duration of this exercise we imagine the observed values are frozen, and therefore treated as constants, as if we are manipulating form (2) (the deterministic version) instead of form (1) (the random version). In reality the end result (the estimated transition probabilities) is a function of the data, so the estimates can be considered random variables; they are estimators in the statistical sense, and we can study properties of these estimators such as expectation and variance.
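For reference, maximizing the log-likelihood in this setup leads to the standard estimator $\hat p_{ij} = n_{ij}/\sum_k n_{ik}$, where $n_{ij}$ counts observed transitions $i \to j$. A minimal sketch (the function name and example path are mine, not from the exercise):

```python
import numpy as np

def estimate_P(path, num_states):
    """Maximum-likelihood estimate of the transition matrix from one observed
    path: p_hat[i, j] = (# transitions i -> j) / (# transitions out of i)."""
    counts = np.zeros((num_states, num_states))
    for a, b in zip(path[:-1], path[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states never left stay all-zero instead of dividing by zero.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Transitions in this path: 0->1 twice, 1->1 three times, 1->0 twice.
path = [0, 1, 1, 0, 1, 1, 1, 0]
P_hat = estimate_P(path, 2)
```

Viewing `path` as random data rather than frozen constants makes `P_hat` an estimator whose expectation and variance can be studied.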
It is true that (1) is a random variable and each of the factors $f_{X_0}$, $p_{X_0,X_1}$, $p_{X_1, X_2}$ is as well. However, the probability mass functions of these random variables are not relevant to the manipulations that lead to the estimates of the transition probabilities.