Let $\{X_t\}$ be a Markov chain on probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with state space $(\chi, \mathbb{B}(\chi))$.
Let $\{\mathcal{F}_t\}$ be a filtration and let $\{X_t\}$ be adapted to $\{\mathcal{F}_t\}$.
Then in the book "Markov Chains and Mixing Times" by Levin, Peres, and Wilmer, a Markov chain $\{X_t\}$ with respect to $\{\mathcal{F}_t\}$ is defined by $$\mathbb{P}_x( X_{t+1} = y \:| \: \mathcal{F}_t) = P(X_t, y)$$ where $P$ is the transition matrix.
I am having trouble understanding the left and right hand terms of the above equation.
(i) I am guessing $P(X_t, y) = \sum_{x \in \chi} P(x,y) \cdot \mathbb{P}(X_t = x)$ ?
(ii) I read the section where they define $\mathbb{E}(X \mid \mathcal{G})$, where $\mathcal{G}$ is a $\sigma$-algebra and $\mathbb{E}(X \mid \mathcal{G})$ is $\mathcal{G}$-measurable. I also know that if $A \in \mathcal{F}$ and $\mathcal{G} \subseteq \mathcal{F}$, then $\mathbb{P}(A \mid \mathcal{G}) := \mathbb{E}(1_A \mid \mathcal{G})$. However, the term on the LHS here is $\mathbb{P}_x( X_{t+1} = y \:| \: \mathcal{F}_t)$. If I write $A := \{X_{t+1} = y\} \in \mathcal{F}_{t+1}$, and I know by the definition of a filtration that $\mathcal{F}_t \subseteq \mathcal{F}_{t+1}$, then the left-hand side is a random variable, whereas the right-hand side looks like a real number.
Also, the book casually mentions that a Markov chain with respect to any filtration is still a Markov chain. Can anyone give a hint as to how to go about showing this? (My main concern is that I cannot write $\mathbb{E}(X \mid \mathcal{G})$ as an explicit summation unless I have a suitable countable partition of $\Omega$, so I am forced to work with the existential definition, and I am not sure how to proceed.)
(i) The LHS is the probability that, starting at $X_0 = x$ and given $\mathcal{F}_t$ (the information of the filtration up to time $t$), the chain is at $y$ at time $t + 1$. The RHS is the probability of the chain moving from state $X_t$ to $y$ in one step, as given by the transition matrix $P$. In particular, your guess in (i) is not right: $P(X_t, y)$ is the random variable obtained by plugging the random state $X_t$ into the first argument of $P$, not an average over states. Note that the RHS is random because it depends on $X_t$, but also that it encodes the Markov property, since it depends only on $X_t$ rather than on all of $\mathcal{F}_t$.
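To make this concrete, here is a minimal numerical sketch (the 3-state transition matrix is made up for illustration). We simulate many runs, condition on the event $\{X_t = x\}$, and check that the empirical frequency of $\{X_{t+1} = y\}$ matches the single matrix entry $P(x, y)$, not an average over states:

```python
import random

random.seed(0)

# A hypothetical 3-state transition matrix (rows sum to 1).
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]

def step(state):
    # Sample the next state from row `state` of P.
    return random.choices([0, 1, 2], weights=P[state])[0]

def run(x0, t):
    # Simulate the chain for t steps starting from x0.
    s, path = x0, [x0]
    for _ in range(t):
        s = step(s)
        path.append(s)
    return path

# Estimate P(X_{t+1} = y | X_t = x) by conditioning on {X_t = x}.
t, x, y = 5, 1, 2
hits = total = 0
for _ in range(200_000):
    path = run(x0=0, t=t + 1)
    if path[t] == x:          # restrict to the event {X_t = x}
        total += 1
        hits += (path[t + 1] == y)

print(hits / total)  # should be close to P[x][y] = 0.3
```

On the event $\{X_t = x\}$, the random variable $P(X_t, y)$ takes the deterministic value $P(x, y)$, which is what the empirical frequency recovers.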
(ii) Conditional expectations here are the usual measure-theoretic ones. As mentioned above, both the LHS and the RHS are random variables, measurable with respect to $\mathcal{F}_t$ and $\sigma(X_t)$ respectively. (Their equality means that only the current state $X_t$ at time $t$ matters.)
(iii) When you say that "a Markov chain wrt any filtration is still a Markov chain", I assume that a Markov chain (without any reference to a filtration) simply means a Markov chain with respect to the natural filtration $\sigma(X_0, X_1, \dots, X_t)$. Assume that $(X_t)$ is a Markov chain with respect to a filtration $(\mathcal{F}_t)$. Then we want to show that $$ \mathbb{P}_x\left( X_{t+1} = y \mid \sigma(X_0, X_1, \dots, X_t) \right) = P(X_t, y) . $$ Indeed, by using the tower property (since $\sigma(X_0, X_1, \dots, X_t) \subseteq \mathcal{F}_t$) and the fact that $(X_t)$ is a Markov chain with respect to $(\mathcal{F}_t)$, we have $$ \begin{align} \mathbb{P}_x\left( X_{t+1} = y \mid \sigma(X_0, X_1, \dots, X_t) \right) &= \mathbb{E}_x\left[ 1_{\{X_{t+1} = y\}} \mid \sigma(X_0, X_1, \dots, X_t) \right] \\ &= \mathbb{E}_x\left[ \mathbb{E}_x\left[ 1_{\{X_{t+1} = y\}} \mid \mathcal{F}_t \right] \mid \sigma(X_0, X_1, \dots, X_t) \right] \\ &= \mathbb{E}_x\left[ P(X_t, y) \mid \sigma(X_0, X_1, \dots, X_t) \right] \\ &= P(X_t, y) , \end{align} $$ where the last equality holds because $P(X_t, y)$ is $\sigma(X_0, X_1, \dots, X_t)$-measurable. Thus, $(X_t)$ is also a Markov chain with respect to $\sigma(X_0, X_1, \dots, X_t)$.
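As a sanity check of this "same answer for the larger filtration" fact, here is a small simulation (again with a made-up transition matrix, now on 2 states): the empirical frequency of $\{X_3 = 1\}$ conditioned on the full history $(X_0, X_1, X_2)$ depends only on the last coordinate, and agrees with $P(X_2, 1)$:

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical 2-state transition matrix (rows sum to 1).
P = [[0.7, 0.3],
     [0.4, 0.6]]

def run(x0, t):
    # Simulate t steps of the chain; return the whole path as a tuple.
    s, path = x0, [x0]
    for _ in range(t):
        s = random.choices([0, 1], weights=P[s])[0]
        path.append(s)
    return tuple(path)

# Tally P(X_3 = 1 | X_0, X_1, X_2) empirically, one bucket per history.
counts, hits = Counter(), Counter()
for _ in range(400_000):
    path = run(0, 3)
    hist = path[:3]           # the information in sigma(X_0, X_1, X_2)
    counts[hist] += 1
    hits[hist] += (path[3] == 1)

for hist in sorted(counts):
    freq = hits[hist] / counts[hist]
    # The frequency depends only on the last coordinate hist[-1]:
    print(hist, round(freq, 3), "vs P(X_2, 1) =", P[hist[-1]][1])
```

Histories ending in the same state produce (up to sampling noise) the same conditional frequency, which is exactly what $\mathbb{P}_x(X_{t+1} = y \mid \sigma(X_0, \dots, X_t)) = P(X_t, y)$ asserts.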