Assume that the sequence of random variables $(X_n)_{n\in \mathbb{N}}$ defined on some probability space $(\Omega,\Sigma, \mathbb{P})$ converges uniformly to $X$, that is $$\sup_{\omega \in \Omega}|X_n(\omega)-X(\omega)|\longrightarrow 0 \quad \text{for } n\longrightarrow \infty.$$ Then this implies convergence of $X_n$ in expectation, that is $$|\mathbb{E}(X_n)-\mathbb{E}(X)|\longrightarrow 0 \quad \text{for } n\longrightarrow \infty.$$ A proof could(?) be the following:
We have $$|\mathbb{E}(X_n)-\mathbb{E}(X)| = \left|\int_{\Omega}(X_n-X)\, d\mathbb{P}\right| \leq \int_\Omega|X_n-X|\,d\mathbb{P} = \int_\Omega |X_n(\omega)-X(\omega)|\,d\mathbb{P}(\omega)\leq\int_\Omega \sup_{\omega' \in \Omega}|X_n(\omega')-X(\omega')|\,d\mathbb{P}(\omega),$$ where we use the definition of the expectation and its linearity in the first step, the general inequality for moving the absolute value inside the integral in the second, and an alternative notation for the Lebesgue integral in the third. The last inequality follows from the fact that replacing the integrand by its supremum over all $\omega' \in \Omega$ can only make the integral bigger. Since the final integrand is the constant $\sup_{\omega' \in \Omega}|X_n(\omega')-X(\omega')|$ and $\mathbb{P}(\Omega)=1$, the last integral equals this supremum, which tends to zero by assumption. My problem is that I do not understand what exactly $d\mathbb{P}(\omega)$ means. Is there something else happening in the third step aside from using an alternative notation?
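As a quick numerical sanity check of this chain of inequalities (my own illustration, not part of the original question): take $\Omega = [0,1]$ with the uniform distribution, approximated by Monte Carlo samples, and $X_n = X + \sin(n\omega)/n$, which converges uniformly to $X$ with $\sup$-error at most $1/n$.

```python
import numpy as np

rng = np.random.default_rng(42)
omega = rng.uniform(size=100_000)  # sample points of Omega = [0, 1]

X = omega**2  # an arbitrary choice of X(omega)

for n in (1, 10, 100):
    Xn = X + np.sin(n * omega) / n  # sup |X_n - X| <= 1/n, so uniform convergence
    sup_err = np.max(np.abs(Xn - X))
    exp_err = abs(Xn.mean() - X.mean())  # Monte Carlo estimate of |E(X_n) - E(X)|
    # the bound from the proof: |E(X_n) - E(X)| <= sup |X_n - X|
    assert exp_err <= sup_err
    print(f"n={n}: sup error {sup_err:.4f}, expectation error {exp_err:.6f}")
```

The expectation error is bounded by the sup error at every $n$, and both shrink as $n$ grows, exactly as the inequality chain predicts.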
If we assume that $X_n$ and $X$ are absolutely continuous, then there exist density functions $f_n$ and $f$, and convergence of the expectations would be implied by $$\int_{\mathbb{R}} |x|\,|f_n(x)-f(x)|\,dx$$ going to zero. Can this approach somehow be completed to a proof in the continuous case?
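(The implication asserted here would follow from writing the expectations via the densities and applying the triangle inequality: $$|\mathbb{E}(X_n)-\mathbb{E}(X)| = \left|\int_{\mathbb{R}} x\, f_n(x)\,dx - \int_{\mathbb{R}} x\, f(x)\,dx\right| \leq \int_{\mathbb{R}} |x|\,\big|f_n(x)-f(x)\big|\,dx.$$)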
Your proof is fine. Yes, the third equality is nothing more than a change of notation. All they did was write in $\omega$ as the dummy variable of integration, like $\int f\,d\mu$ versus $\int f(x)\, d\mu(x)$. I guess they thought it would make the following step clearer, though I'm not sure that it really did.
Your proposed argument in the last paragraph won't work. The problem is that any formula involving only $f_n$ and $f$ depends only on the marginal distributions of $X_n$ and of $X$. But an event that involves all the $X_n$ and $X$ at once, like $\{|X_n - X| \to 0\}$, has a probability that depends on the joint distribution of the infinitely many random variables $(X_1, X_2, \dots, X)$.
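To make this concrete, here is a small simulation (my own example, not from the question): take $X$ uniform on $[0,1]$ and $Y = 1 - X$. Then $Y$ has exactly the same density as $X$, so no formula in the densities alone can distinguish them, yet $|Y - X|$ is nowhere near $0$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=100_000)
Y = 1.0 - X  # same marginal distribution (Uniform[0,1]) as X

# Quantities depending only on the marginals agree (up to sampling error)...
print(abs(X.mean() - Y.mean()))  # small: both means are near 1/2

# ...but the joint behaviour is completely different:
print(np.max(np.abs(X - Y)))  # near 1: |Y - X| = |1 - 2X| does not go to 0
```

The marginals coincide, but the joint distribution of $(X, Y)$ is concentrated on the anti-diagonal rather than the diagonal, so statements like $|X_n - X| \to 0$ simply cannot be read off from $f_n$ and $f$.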