Let $(E,d)$ be a compact metric space, $(T(t))_{t\ge0}$ be a strongly continuous contraction semigroup on $C(E)$, $(\Omega,\mathcal A,\operatorname P)$ be a probability space, and $(X_t)_{t\ge0}$ be an $E$-valued process on $(\Omega,\mathcal A,\operatorname P)$ with $$\operatorname E\left[f(X_t)\mid\mathcal F^X_s\right]=(T(t-s)f)(X_s)\;\;\;\text{almost surely}\tag1$$ for all $f\in C(E)$ and $t\ge s\ge0$, where $\mathcal F^X$ denotes the filtration generated by $X$.
By $(1)$, it is easy to see that $$\operatorname E\left[\left|f(X_{t+h})-f(X_t)\right|^2\mid\mathcal F^X_t\right]\le\left\|T(h)f^2-f^2\right\|_\infty+2\left\|f\right\|_\infty\left\|T(h)f-f\right\|_\infty\xrightarrow{h\to0+}0\tag2$$ for all $f\in C(E)$ and $t\ge0$.
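A sketch of why $(2)$ holds, assuming $f$ is real-valued (so that $f^2\in C(E)$): since $f(X_t)$ is bounded and $\mathcal F^X_t$-measurable, expanding the square and applying $(1)$ to both $f$ and $f^2$ gives, almost surely, \begin{equation}\begin{split}\operatorname E\left[\left|f(X_{t+h})-f(X_t)\right|^2\mid\mathcal F^X_t\right]&=\left(T(h)f^2\right)(X_t)-2f(X_t)\left(T(h)f\right)(X_t)+f^2(X_t)\\&=\left(T(h)f^2-f^2\right)(X_t)-2f(X_t)\left(T(h)f-f\right)(X_t)\\&\le\left\|T(h)f^2-f^2\right\|_\infty+2\left\|f\right\|_\infty\left\|T(h)f-f\right\|_\infty.\end{split}\end{equation} The right-hand side is deterministic, does not depend on $t$, and tends to $0$ as $h\to0+$ by strong continuity applied to $f$ and $f^2$.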
How can we conclude that $$\sup_{x\in E}\operatorname E\left[d(X_s,X_t)\wedge1\mid X_0=x\right]\xrightarrow{s-t\to0}0?\tag3$$
For fixed $s,t\ge0$, $\operatorname E\left[d(X_s,X_t)\wedge1\mid X_0=\;\cdot\;\right]$ is defined to be a Borel measurable function $h_{s,\:t}:E\to[0,\infty)$ with $$\operatorname E\left[d(X_s,X_t)\wedge1\mid X_0\right]=h_{s,\:t}\circ X_0\;\;\;\text{almost surely};\tag4$$ $h_{s,\:t}$ is unique only up to a $\operatorname P\circ X_0^{-1}$-null set.
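Clearly, by $(2)$ and Jensen's inequality, $$\operatorname E\left[\left|f(X_s)-f(X_t)\right|\mid\mathcal F^X_{s\:\wedge\:t}\right]\xrightarrow{s-t\to0}0\tag5$$ for all $f\in C(E)$. In particular, $(5)$ holds for $f=f_n$, $n\in\mathbb N$, where $(f_n)_{n\in\mathbb N}\subseteq C(E)$ is the sequence defining the equivalent metric $\rho(x,y)=\sum_{k=1}^\infty a_k\left(1\wedge\left|f_k(x)-f_k(y)\right|\right)$.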
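The Jensen step behind $(5)$ can be spelled out as follows; this is only a sketch, assuming real-valued $f$ and writing $u:=s\wedge t$ and $h:=|s-t|$ (both introduced here for illustration). The conditional Jensen inequality for the concave function $x\mapsto\sqrt x$, followed by $(2)$ with $t$ replaced by $u$, yields, almost surely, \begin{equation}\begin{split}\operatorname E\left[\left|f(X_s)-f(X_t)\right|\mid\mathcal F^X_u\right]&\le\left(\operatorname E\left[\left|f(X_{u+h})-f(X_u)\right|^2\mid\mathcal F^X_u\right]\right)^{1/2}\\&\le\left(\left\|T(h)f^2-f^2\right\|_\infty+2\left\|f\right\|_\infty\left\|T(h)f-f\right\|_\infty\right)^{1/2}.\end{split}\end{equation} The bound is deterministic and depends on $s$ and $t$ only through $|s-t|$, so the convergence in $(5)$ is uniform in $\omega$ (outside a null set for each pair $s,t$) and in the base time $s\wedge t$.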
So how do we argue with the equivalent metric, and why is $(3)$ even well-defined? What worries me is that the regular version of the conditional expectation is only determined up to a null set, while $(3)$ takes a supremum over all $x\in E$.
I guess we somehow need to infer $$\operatorname E\left[\rho(X_s,X_t)\mid\mathcal F^X_{s\:\wedge\:t}\right]\xrightarrow{s-t\to0}0\tag6$$ from $(5)$.
EDIT: Let $\varepsilon>0$. Since $a_k=2^{-k}$ and $\sum_{k\in\mathbb N}2^{-k}=1<\infty$, there is a $K\in\mathbb N$ with $\sum_{k>K}a_k<\varepsilon/2$. By $(5)$, there is a $\delta>0$ with $$\operatorname E\left[\left|f_k(X_s)-f_k(X_t)\right|\mid\mathcal F^X_{s\:\wedge\:t}\right]<\frac\varepsilon{2\sum_{j=1}^Ka_j}\tag7$$ for all $s,t\ge0$ with $|s-t|<\delta$ and $k\in\left\{1,\ldots,K\right\}$. Thus, by the dominated convergence theorem, \begin{equation}\begin{split}\operatorname E\left[\rho(X_s,X_t)\mid\mathcal F^X_{s\:\wedge\:t}\right]&=\sum_{k=1}^\infty a_k\operatorname E\left[1\wedge\left|f_k(X_s)-f_k(X_t)\right|\mid\mathcal F^X_{s\:\wedge\:t}\right]\\&\le\sum_{k=1}^Ka_k\operatorname E\left[\left|f_k(X_s)-f_k(X_t)\right|\mid\mathcal F^X_{s\:\wedge\:t}\right]+\sum_{k>K}a_k<\varepsilon\end{split}\tag8\end{equation} for all $s,t\ge0$ with $|s-t|<\delta$, which is $(6)$.