I have seen it written that the integrable process $(X_1,\ldots,X_n)$ is a martingale if and only if $$ \mathbb{E}[h(X_1,\ldots,X_k)(X_{k+1} - X_k)] =0, $$ for all continuous, bounded functions $h$, for each $k=1,\ldots,n-1$.
The definition of a martingale states that the process is integrable and that $\mathbb{E}[X_{k+1} | X_1,\ldots,X_k] = X_k$ for each $k=1,\ldots,n-1$, that is, $$ \mathbb{E}[X_{k+1} \chi_{A}] = \mathbb{E}[X_{k} \chi_{A}] $$ for each $A \in \sigma(X_1,\ldots,X_k)$ (where $\chi_A$ denotes the indicator function of $A$).
How do I convert backwards and forwards between the two characterisations? I feel it has something to do with the Monotone Class theorem for one direction, but I am not exactly sure how I would apply this, if at all.
I believe that by taking simple function approximations to a continuous, bounded function and applying the monotone convergence theorem we obtain one direction. But I am not sure about the other way.
Thanks in advance.
We have to prove that if $Y$ is an integrable random variable (here $Y=X_{k+1}-X_k$) such that $$\tag{1} \mathbb{E}[h(X_1,\ldots,X_k)Y] =0, $$ for each continuous and bounded function $h\colon\mathbb R^k\to \mathbb R$, then $\mathbb E\left[Y\chi_A\right]=0$ for all $A \in \sigma(X_1,\ldots,X_k)$. Since every such $A$ is of the form $\left\{(X_1,\dots,X_k)\in B \right\}$ for some Borel set $B$, this means that for every Borel subset $B$ of $\mathbb R^k$, $$\tag{2} \mathbb E\left[Y\chi_{\left\{(X_1,\dots,X_k)\in B \right\}}\right]=0. $$
First, we prove (2) when $B$ is a closed set. Let $B\subset \mathbb R^k$ be closed; denote by $d(x,A):=\inf\{\lVert x-y\rVert : y\in A\}$ the distance from $x$ to a set $A$. Denote $B_n:=\{x : d(x,B)\leqslant 1/n\}$ and define the function $$h_n\colon x\mapsto 1-\frac{d(x,B_n)}{d(x,B_n)+1/n} .$$ For each fixed $n$, this function is continuous and takes values in $[0,1]$. Moreover, $h_n(x)\to \chi_B(x)$ for all $x$; hence (2) follows by applying (1) to $h_n$ and passing to the limit with the dominated convergence theorem, with $\lvert Y\rvert$ as the dominating function.
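The limit step can be written out as follows. This is only a sketch of the details; the shorthand $X:=(X_1,\dots,X_k)$ is introduced here for brevity and is not part of the original notation. The triangle inequality gives $d(x,B)\leqslant d(x,B_n)+1/n$, and $d(x,B)\gt 0$ for $x\notin B$ because $B$ is closed, so
$$ h_n(x)=\frac{1/n}{d(x,B_n)+1/n}, \qquad h_n(x)=1 \ \text{ if } x\in B, \qquad 0\leqslant h_n(x)\leqslant \frac{1}{n\, d(x,B)}\xrightarrow[n\to\infty]{} 0 \ \text{ if } x\notin B. $$
Since $\lvert Y h_n(X)\rvert\leqslant \lvert Y\rvert$ and $Y$ is integrable, dominated convergence together with (1) yields
$$ \mathbb E\left[Y\chi_{\{X\in B\}}\right]=\lim_{n\to\infty}\mathbb E\left[Y h_n(X)\right]=0. $$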
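We now treat a general Borel set $B$, using the following two facts:
1. If $Y$ is integrable, then for every $\varepsilon\gt 0$ there exists $\delta\gt 0$ such that $\mathbb E\left[\left\lvert Y\right\rvert\chi_A\right]\lt\varepsilon$ whenever $\mathbb P(A)\lt\delta$.
2. If $\mu$ is a Borel probability measure on $\mathbb R^k$, then for every Borel set $B$ and every $\delta\gt 0$ there exists a closed set $F\subset B$ such that $\mu(B\setminus F)\lt\delta$.
Fix $\varepsilon\gt 0$ and a Borel subset $B$ of $\mathbb R^k$. By fact 1, fix $\delta\gt 0$ such that $\mathbb E\left[\left\lvert Y\right\rvert\chi_A\right]\lt\varepsilon$ whenever $A$ has probability smaller than $\delta$. By fact 2 applied to $\mu=\mathbb P_{(X_1,\dots,X_k)}$, the law of $(X_1,\dots,X_k)$, we can find a closed set $F\subset B$ such that $\mathbb P_{(X_1,\dots,X_k)}(B\setminus F)\lt \delta$. Then apply the closed-set case proved above with $F$ instead of $B$ and conclude, since $\varepsilon$ is arbitrary.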
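Spelled out, the concluding estimate might read as follows (a sketch, writing $X:=(X_1,\dots,X_k)$ as a shorthand that the original does not use). Since $F\subset B$, the event $\{X\in B\}$ is the disjoint union of $\{X\in F\}$ and $\{X\in B\setminus F\}$, and $\mathbb P(X\in B\setminus F)=\mathbb P_{X}(B\setminus F)\lt\delta$, so
$$ \left\lvert \mathbb E\left[Y\chi_{\{X\in B\}}\right]\right\rvert =\left\lvert \mathbb E\left[Y\chi_{\{X\in F\}}\right]+\mathbb E\left[Y\chi_{\{X\in B\setminus F\}}\right]\right\rvert \leqslant 0+\mathbb E\left[\lvert Y\rvert\chi_{\{X\in B\setminus F\}}\right]\lt\varepsilon. $$
The first term vanishes because $F$ is closed, and the second is controlled by the choice of $\delta$. As $\varepsilon\gt 0$ was arbitrary, (2) holds for every Borel set $B$.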