Consider a simple Ornstein–Uhlenbeck process $X(t)$:
$$ \mathrm d X(t) = - X(t) \, \mathrm dt + \sqrt{2} \, \mathrm dW(t). \tag{1} $$
If we apply Itô's lemma in its common formulation to get an SDE for $X^2(t)$, we obtain
$$ \mathrm d X^2(t) = [- 2X^2(t) + 2] \, \mathrm d t + 2 X(t) \sqrt 2 \, \mathrm dW(t).\tag{2} $$
Note that both equations are driven by the same Wiener process $W(t)$. This seems natural, since $X(t)$ and $X^2(t)$ should be driven by the same source of noise. From these two SDEs we might incorrectly conclude that
$$ \mathrm d X^2(t) - 2 X(t) \, \mathrm d X(t) = 2 \, \mathrm dt. $$
Hence it seems that the quantity $\mathrm d X^2 - 2 X \, \mathrm d X$ is deterministic. However, let us use the Euler–Maruyama discretization scheme ($\xi$ is a normally distributed random number with mean $0$ and variance $1$):
$$ X(t + \Delta t) = X(t) - X(t) \, \Delta t + \sqrt{2} \sqrt{\Delta t} \, \xi(t). $$
From it, we can calculate $\Delta X^2 - 2 X \, \Delta X$ up to terms of order $\Delta t$:
$$ [X^2(t + \Delta t) - X^2(t)] - 2 X(t) [X(t + \Delta t) - X(t)] = 2 \Delta t \, \xi^2(t) + \ldots, $$
which is a random variable.
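To spell out the step behind this expansion (a short sketch that uses only the scheme above): the left-hand side is exactly $[X(t + \Delta t) - X(t)]^2$, because $a^2 - b^2 - 2b(a - b) = (a - b)^2$. Substituting the Euler–Maruyama increment gives
$$ [X(t + \Delta t) - X(t)]^2 = \left[ - X(t) \, \Delta t + \sqrt{2} \sqrt{\Delta t} \, \xi(t) \right]^2 = 2 \Delta t \, \xi^2(t) - 2 \sqrt{2} \, X(t) \, \Delta t^{3/2} \, \xi(t) + X^2(t) \, \Delta t^2, $$
so the only term of order $\Delta t$ is $2 \Delta t \, \xi^2(t)$.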
Question: How do I correctly apply Itô's lemma (or something else) to calculate $\mathrm d X^2 - 2 X \, \mathrm d X$? Can it be expressed in terms of $\mathrm dW(t)$? To be clear, I am interested in realization-specific identities (in the strong, not the weak, sense). How should Itô's lemma be formulated so that one avoids paradoxes like this? I used the OU process just as an illustration; I am interested in the case of a general SDE for $X(t)$.
Update: In case you wonder why I insist that $\mathrm dX^2 - 2X \, \mathrm dX \neq 2 \, \mathrm dt$: you can use any program of your choice to simulate $X(t)$ and then calculate $g(t)$ as
$$ g(t) = [X^2(t + \Delta t) - X^2(t)] - 2 X(t) [X(t + \Delta t) - X(t)] $$
for small values of $\Delta t$. You will see that $g(t)$ is random. I attach below a screenshot from Mathematica that does that (I also tried writing my own program in Julia, with the same result). I admit that $g(t)$ might not be the correct approximation of $\mathrm dX^2 - 2X \, \mathrm dX$; if so, please tell me how to do it right.
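For reference, here is a minimal Python sketch of this experiment. It is not the Mathematica or Julia code mentioned above; the step size, path length, initial value, and seed are arbitrary choices made for illustration.

```python
import numpy as np

# Illustrative parameters (assumptions, not taken from the original post).
dt = 1e-3          # time step
n_steps = 100_000  # number of Euler-Maruyama steps
rng = np.random.default_rng(0)

# Euler-Maruyama path of dX = -X dt + sqrt(2) dW, started at X(0) = 0.
xi = rng.standard_normal(n_steps)
x = np.empty(n_steps + 1)
x[0] = 0.0
for k in range(n_steps):
    x[k + 1] = x[k] - x[k] * dt + np.sqrt(2.0 * dt) * xi[k]

# g(t) = [X^2(t+dt) - X^2(t)] - 2 X(t) [X(t+dt) - X(t)], one value per step.
g = np.diff(x**2) - 2.0 * x[:-1] * np.diff(x)

# Rescale by dt: a deterministic "2 dt" would give a constant 2 here.
g_scaled = g / dt
print("mean of g/dt:", g_scaled.mean())
print("std  of g/dt:", g_scaled.std())
```

With these settings the per-step values of `g / dt` fluctuate strongly from step to step, while their sample mean stays close to 2.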

It is incorrect to write $$ \mathrm d X = - X \, \mathrm dt + \sqrt{2} \, \mathrm dW(t)\Longrightarrow X(t + \Delta t) \approx X(t) - X(t) \, \Delta t + \sqrt{2} \sqrt{\Delta t} \, \xi(t) \tag{1} $$ The discretization scheme should not be applied directly to the stochastic differential equation, so $(1)$ is not correct.
If you first compute the solution $X_t$, $$X_t =X_0e^{-t}+\int_0^t\sqrt{2}e^{s-t}dW_s \tag{2}$$ you can then apply the Euler–Maruyama discretization scheme to it.
Note that the formula $dX^2_t-2X_tdX_t = 2dt$ is correct. You can use $(2)$ to test it.
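One possible reading of this suggestion, stated here as an assumption rather than as the intended method, is to sample the exact one-step transition implied by $(2)$: conditional on $X_t$, the value $X_{t+\Delta t}$ is Gaussian with mean $e^{-\Delta t}X_t$ and variance $1-e^{-2\Delta t}$. The helper name `exact_ou_step` below is hypothetical.

```python
import numpy as np

def exact_ou_step(x, dt, rng):
    """Advance dX = -X dt + sqrt(2) dW by dt using the exact Gaussian transition.

    From the explicit solution, X(t+dt) = exp(-dt) X(t) + N(0, 1 - exp(-2 dt)).
    """
    decay = np.exp(-dt)
    # expm1 keeps the variance accurate for very small dt.
    noise_std = np.sqrt(-np.expm1(-2.0 * dt))
    return decay * x + noise_std * rng.standard_normal(np.shape(x))

# Usage: many independent one-step samples started from X = 0.
rng = np.random.default_rng(1)
dt = 0.5
x_next = exact_ou_step(np.zeros(200_000), dt, rng)
sample_var = x_next.var()
print("sample variance:", sample_var, "theory:", 1.0 - np.exp(-2.0 * dt))
```

Unlike a first-order scheme, this update has no time-stepping error in the marginal distribution of $X_{t+\Delta t}$, whatever the size of $\Delta t$.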
From $(2)$, we will prove $$dX_t^2-2X_tdX_t=2dt \tag{3}$$
For the sake of simplicity, we denote $Z_t :=\int_0^t\sqrt{2}e^{s}dW_s$, then $X_t=e^{-t}(X_0+Z_t)$.
We have some results: $$\begin{align} dZ_t &= \sqrt{2}e^tdW_t \tag{4}\\ d(Z_t^2) &= 2Z_tdZ_t + (dZ_t)^2 \stackrel{(4)}{=} 2\sqrt{2}Z_te^tdW_t+2e^{2t}dt\tag{5}\\ X_t^2 &= e^{-2t}(X_0^2+2X_0Z_t+Z_t^2)\tag{6}\\ d(X_t^2) &\stackrel{(6)}{=} e^{-2t}d(X_0^2+2X_0Z_t+Z_t^2)-2e^{-2t}(X_0^2+2X_0Z_t+Z_t^2)dt\\ &=e^{-2t} \left( 2X_0dZ_t+d(Z_t^2) - 2(X_0^2+2X_0Z_t+Z_t^2)dt \right)\\ &\stackrel{(4,5)}{=}e^{-2t} \left( \color{red}{2X_0\sqrt{2}e^tdW_t+2\sqrt{2}Z_te^tdW_t}+2e^{2t}dt - 2(X_0^2+2X_0Z_t+Z_t^2)dt \right)\\ &=e^{-2t} \left( \color{red}{2\sqrt{2}e^{2t}X_tdW_t}+2e^{2t}dt - 2(X_0^2+2X_0Z_t+Z_t^2)dt \right)\\ &= 2\sqrt{2}X_tdW_t+2dt - 2(X_0^2+2X_0Z_t+Z_t^2)e^{-2t}dt \\ &= 2\sqrt{2}X_tdW_t+2dt - 2X_t^2dt \tag{7}\\ X_tdX_t &= \sqrt{2}X_tdW_t -X_t^2 dt \tag{8} \end{align}$$
Finally, from $(7)$ and $(8)$, we can prove $(3)$: $$\begin{align} \color{red}{dX_t^2- 2 X_tdX_t} &= (2\sqrt{2}X_tdW_t+2dt - 2X_t^2dt) - 2(\sqrt{2}X_tdW_t -X_t^2 dt) = \color{red}{2dt} \end{align}$$
Remark: the derivation is quite time-consuming, but it does reach the end of the proof.
Again, we cannot apply the discretization scheme directly to SDEs, because that is the source of the errors. Take for example the well-known SDE $$\frac{dS_t}{S_t}=\sigma dW_t \tag{9}$$ whose solution is $$S_t = S_s\cdot \exp\left(-\frac{1}{2}\sigma^2 (t-s) + \sigma (W_t-W_s) \right) \tag{10}$$
With a discretization time step $t_n = \Delta t \cdot n$, from $(10)$, we have $$\begin{align} S_{t_n} &= S_{t_{n-1}}\exp\left(-\frac{1}{2}\sigma^2 \Delta t + \sigma \sqrt{\Delta t} \mathcal{N}(0,1) \right) \\ \text{or}\hspace{0.5cm} S_{t_n} &\approx S_{t_{n-1}} \left(1 \color{red}{-\frac{1}{2}\sigma^2 \Delta t} + \sigma \sqrt{\Delta t} \mathcal{N}(0,1) \right) \end{align}$$
If we discretize $(9)$ directly instead, the red term is missing: $$S_{t_n} \approx S_{t_{n-1}}\left(1 + \sigma \sqrt{\Delta t} \mathcal{N}(0,1) \right)$$