Suppose $X, Y$ are bounded random variables and we have that for every $m, n$ positive integers, $\mathbb{E}[X^mY^n] = \mathbb{E}[X^m]\mathbb{E}[Y^n]$. Then show that $X, Y$ are independent.
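As a quick sanity check that the hypothesis carries real information (my own illustration, not part of the problem): for a dependent pair such as $X\sim\mathrm{Unif}[-1,1]$, $Y=X^2$, the moment identity already fails, which can be verified exactly:

```python
from fractions import Fraction

def uniform_moment(k):
    """Exact k-th moment of X ~ Uniform[-1, 1]: 0 for odd k, 1/(k+1) for even k."""
    return Fraction(0) if k % 2 else Fraction(1, k + 1)

def joint_moment(m, n):
    """E[X^m Y^n] for the dependent pair Y = X^2, i.e. E[X^(m+2n)]."""
    return uniform_moment(m + 2 * n)

# m = 2, n = 1: E[X^2 Y] = E[X^4] = 1/5, but E[X^2] E[Y] = (1/3)(1/3) = 1/9.
lhs = joint_moment(2, 1)
rhs = uniform_moment(2) * uniform_moment(2)
print(lhs, rhs)  # 1/5 1/9
```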
I have some idea of how this is supposed to go:
Using linearity, etc, $\mathbb{E}[f(X)g(Y)]$ = $\mathbb{E}[f(X)]\mathbb{E}[g(Y)]$ for all polynomials $f, g$.
After this we use limit theorems to extend this to all bounded measurable functions (or to characteristic functions), and in particular to indicators: for every measurable $A, B$ we get $\mathbb{E}[1_{X \in A}\cdot 1_{Y \in B}] = \mathbb{E}[1_{X \in A}]\cdot\mathbb{E}[1_{Y\in B}]$, which reduces to $P(X \in A, Y \in B) = P(X \in A)P(Y \in B)$, i.e. independence.
But all the limit identities and how to use them has never been natural to me, so I wanted to verify and ask for help for the details.
First, for continuous functions $f, g$, we use the Stone–Weierstrass theorem to get sequences of polynomials $f_n, g_n$ that converge uniformly to $f, g$ on the bounded interval that $X, Y$ take values in. Then $\mathbb{E}[f_n(X)] \rightarrow \mathbb{E}[f(X)]$, and similarly for $Y, g$. Which theorem exactly are we using here for this convergence (assuming I'm not doing something completely wrong, of course); the dominated convergence theorem? Once we have this, we also show that $\mathbb{E}[f_n(X)g_n(Y)] \rightarrow \mathbb{E}[f(X)g(Y)]$, which requires showing $f_ng_n \rightarrow fg$, but this just leverages standard analysis techniques and the uniform convergence, am I correct?
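To see the Stone–Weierstrass step numerically, here is a small sketch (the function $f$ and the Bernstein-polynomial construction are my choices for illustration): Bernstein polynomials converge uniformly to a continuous $f$ on $[0,1]$, and $\lvert\mathbb{E}[f_n(X)]-\mathbb{E}[f(X)]\rvert\le\sup\lvert f_n-f\rvert$ then forces convergence of the expectations.

```python
from math import comb

def f(x):
    return abs(x - 0.5)  # continuous on [0, 1] but not a polynomial

def bernstein(f, n, x):
    """n-th Bernstein polynomial of f evaluated at x in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

grid = [i / 1000 for i in range(1001)]

def sup_error(n):
    """Approximate sup-norm distance between f and its n-th Bernstein polynomial."""
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)

def mean_on_grid(g):
    # crude quadrature for E[g(X)] with X ~ Uniform[0, 1]
    return sum(g(x) for x in grid) / len(grid)

for n in (5, 20, 80):
    err = sup_error(n)
    gap = abs(mean_on_grid(lambda x: bernstein(f, n, x)) - mean_on_grid(f))
    print(n, err, gap)
    # the gap in expectations is always controlled by the sup-norm error
    assert gap <= err + 1e-12
```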
Next we want to extend this to all measurable functions $f, g$. For this we use the idea that there are sequences $f_n, g_n$ of continuous functions that converge pointwise a.e. to $f, g$. Is this enough to conclude that $\mathbb{E}[f_n(X)] \rightarrow \mathbb{E}[f(X)]$, and similarly for $Y, g$? It seems we need stronger assumptions here. I'm not entirely sure how valid this last step is and would appreciate some detail.
Your strategy almost works, except that not every simple function is the pointwise limit of continuous functions (e.g., $1_{\mathbb{Q}\cap[0,1]}$, which is of Baire class 2), so you need to find a way around that. The usual way (asked before many times, see e.g. here, here and here) is to go via the characteristic function $\varphi_{\vec{X}}(\vec\theta):=\mathbb{E}[\exp(i\vec{\theta}\cdot\vec{X})]$ or its cousin the moment generating function, but let me describe a more measure-theoretic way using the monotone class theorem.
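For completeness, here is a sketch of why the characteristic-function route works specifically for bounded variables: with $|X|,|Y|\le M$, the exponential series converges absolutely and can be integrated term by term.

```latex
\varphi_{(X,Y)}(s,t)
 = \mathbb{E}\!\left[\sum_{m,n\ge 0}\frac{(isX)^m(itY)^n}{m!\,n!}\right]
 = \sum_{m,n\ge 0}\frac{(is)^m(it)^n}{m!\,n!}\,\mathbb{E}[X^mY^n]
 = \sum_{m\ge 0}\frac{(is)^m}{m!}\,\mathbb{E}[X^m]\;\sum_{n\ge 0}\frac{(it)^n}{n!}\,\mathbb{E}[Y^n]
 = \varphi_X(s)\,\varphi_Y(t),
```

where the interchange of expectation and sum is justified by dominated convergence (the partial sums are bounded by $e^{(|s|+|t|)M}$), and a factorized joint characteristic function implies independence.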
Let's start from the beginning. Without any loss of generality, we can transfer the problem to $\Omega=[-M,M]^2$ with $X,Y$ the coordinate functions.
Stone–Weierstrass and uniform convergence: if $X_n\to X$ uniformly, then $\mathbb{E}X_n\to\mathbb{E}X$, since $\mathbb{E}\lvert X_n-X\rvert\leq\sup\lvert X_n-X\rvert\to 0$. In particular, if $f,g$ are continuous, then there exist polynomials $f_n,g_n$ converging uniformly to $f,g$. Then $(x,y)\mapsto f_n(x)g_n(y)$ converges uniformly to $f(x)g(y)$, and thus $\mathbb{E}[f_n(X)g_n(Y)]\to\mathbb{E}[f(X)g(Y)]$. Hence $\mathbb{E}[f(X)g(Y)]=\mathbb{E}[f(X)]\mathbb{E}[g(Y)]$ for all continuous $f,g$.
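The estimate behind the uniform convergence of the product is worth writing out once:

```latex
\sup_{x,y}\,\lvert f_n(x)g_n(y)-f(x)g(y)\rvert
 \;\le\; \sup_x\lvert f_n(x)\rvert\cdot\sup_y\lvert g_n(y)-g(y)\rvert
 \;+\; \sup_y\lvert g(y)\rvert\cdot\sup_x\lvert f_n(x)-f(x)\rvert
 \;\longrightarrow\; 0,
```

using $f_ng_n-fg=f_n(g_n-g)+(f_n-f)g$ and the fact that $\sup\lvert f_n\rvert$ is eventually bounded by $\sup\lvert f\rvert+1$ thanks to the uniform convergence.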
Then we appeal to the functional monotone class theorem (MCT): if $\mathcal{K}$ is a set of bounded measurable functions closed under multiplication, and $\mathcal{H}$ is a vector space of bounded functions that contains $\mathcal{K}$ and the constants, and is closed under bounded monotone limits, then $\mathcal{H}$ contains every bounded $\sigma(\mathcal{K})$-measurable function.
Here we take $\mathcal{K}_1$ to be the polynomials on $[-M,M]$ (closed under multiplication, and generating the Borel $\sigma$-algebra) and $\mathcal{H}_1$ the set of bounded measurable $f$ such that $\mathbb{E}[f(X)g(Y)]=\mathbb{E}[f(X)]\mathbb{E}[g(Y)]$ for all $g\in\mathcal{K}_1$; note $\mathcal{H}_1\supseteq\mathcal{K}_1$ by the previous step, and $\mathcal{H}_1$ is a vector space closed under bounded monotone limits by dominated convergence. By MCT, $\mathcal{H}_1$ contains all bounded measurable functions on $[-M,M]$. Now apply MCT again, with $\mathcal{K}_2=\mathcal{K}_1$ and $\mathcal{H}_2$ the set of bounded measurable $g$ satisfying $\mathbb{E}[f(X)g(Y)]=\mathbb{E}[f(X)]\mathbb{E}[g(Y)]$ for all $f\in\mathcal{B}_b$ (which contains $\mathcal{K}_2$ by the first application); this gives the identity for all $f,g\in\mathcal{B}_b$.
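Finally, specializing to indicators gives the independence statement:

```latex
\mathbb{E}\bigl[\mathbf{1}_A(X)\,\mathbf{1}_B(Y)\bigr]
 = \mathbb{E}\bigl[\mathbf{1}_A(X)\bigr]\,\mathbb{E}\bigl[\mathbf{1}_B(Y)\bigr]
 \quad\Longleftrightarrow\quad
 \mathbb{P}(X\in A,\,Y\in B)=\mathbb{P}(X\in A)\,\mathbb{P}(Y\in B)
```

for all Borel sets $A,B\subseteq[-M,M]$, which is exactly the independence of $X$ and $Y$.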