Consider a Markov process $x(t)$ that takes the discrete values $\{x_i\}_{i=1,\cdots,n}$ and whose probability density evolves according to a master equation: $$ \partial_t p_i(t) = \sum_{j=1}^n[w(i|j) p_j(t)- w(j|i) p_i(t). $$ As usual, we require $w(i|j) \geq 0$ and $\sum_{i=1}^n w(i|j) = 1$. Consider now another Markov process $y(t)$ which can take continuous real values, and whose evolution depends parametrically on the state $x_i$ as $$ \partial_t q(y,t) = \mathsf{A}(x_i) q(y,t), $$ with $\mathsf{A}(x_i)$ being a differential operator which acts on the $y$ dependence. Let $z(t) = (x(t),y(t))$ be the joint stochastic process.
Question: How do I proof (or under which assumptions is true) that $P_i(y,t)$ evolves according to $$ \partial_t P_i(y,t) = \mathsf{A}(x_i) P_i(y,t) + \sum_{j=1}^n[w(i|j) P_j(y,t)- w(j|i) P_i(y,t)], $$ as claimed by Van Kampen in Physica A, 96 3 (1979) [https://doi.org/10.1016/0378-4371(79)90005-0]?
Example: The simplest case I can think of is as follows. Let $x(t) = \{+1,-1\}$ be the dichotomous Markov process with transition rates $w(+|-) = w(-|+) = w(+|+) = w(-|-) = 1/2$. Let $y(t)$ be the velocity in the Langevin equation with a friction coefficient $\gamma(x_i)$ that depends on the state $x_i$; that is, $$ \dot{y}(t) = -\gamma(x_i) y + \xi(t), $$ where $\xi(t)$ is a Gaussian white noise with $\langle \xi(t) \rangle = 0$ and $\langle \xi(t) \xi(t')\rangle = 2 \Gamma \delta (t-t')$. Then the probability $q(y,t)$ for a fixed state $x_i$ evolves with the usual Fokker-Planck equation $$ q(y,t) = \gamma(x_i) \partial_y y q(y,t) + \Gamma \partial_y^2 q(y,t). $$ Then, is the evolution equation for the joint probability $P_i(y,t)$ given by $$ P_+(y,t) = \gamma(+1) \partial_y y P_+(y,t) + \Gamma \partial_y^2 P_+(y,t)+(1/2)(P_-(y,t) - P_+(y,t)),\\ P_-(y,t) = \gamma(-1) \partial_y y P_-(y,t) + \Gamma \partial_y^2 P_-(y,t)+(1/2)(P_+(y,t) - P_-(y,t))? $$