To prove Itô's lemma, one basic requirement is that the function $f(X_t, t)$ be (at least) twice differentiable, so that $f$ can be expanded in a Taylor series.
But what does differentiability with respect to a random process $X_t$ mean? Should $\displaystyle \frac{\partial f(X_t,t)}{\partial{X_t}}$ be understood as $$ \lim_{dt\to0}\frac{f(X_{t+dt},t)-f(X_t,t)}{X_{t+dt}-X_t}? $$ If so, in what sense? In probability? In distribution?
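No. It means that the real-valued function $f$ should be differentiable in the ordinary sense, i.e. $\frac{\partial f(x,t)}{\partial x}$ should exist for all $x$. The derivative is taken with respect to the real variable $x$ and only afterwards evaluated at $x = X_t$. Requiring that the limit $$\lim_{dt \rightarrow 0} \frac{f(X_{t+dt},t)-f(X_t,t)}{X_{t+dt}-X_t}$$ exist would be far too restrictive, and would rule out even functions like $f(X_t,t) = X_t$ for a lot of stochastic processes (for example, the quotient is not even defined whenever $X_{t+dt} = X_t$).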
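To make this concrete, here is a small worked example. The choices $f(x,t) = x^2$ and $X_t = W_t$ (a standard Brownian motion) are illustrative assumptions, not part of the question. The partial derivatives are computed for the deterministic function $f(x,t)$ and then evaluated along the path:

$$\frac{\partial f}{\partial t}(x,t) = 0, \qquad \frac{\partial f}{\partial x}(x,t) = 2x, \qquad \frac{\partial^2 f}{\partial x^2}(x,t) = 2.$$

Substituting $x = W_t$ into the second-order Taylor expansion and using the heuristic rule $(dW_t)^2 = dt$ gives

$$d\big(W_t^2\big) = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\bigg|_{x=W_t} dW_t + \frac12\,\frac{\partial^2 f}{\partial x^2}\bigg|_{x=W_t} (dW_t)^2 = 2W_t\,dW_t + dt.$$

At no point is a difference quotient in $W_t$ needed: only ordinary calculus on $f$, followed by evaluation at the random point $(W_t, t)$.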