From Wikipedia:
if $f:\mathbb{R}\times\mathbb{R} \rightarrow \mathbb{R}$ is a smooth function, then $$\int\limits_0^T \frac{\partial f}{\partial W}(W_t,t)\circ \mathrm{d}W_t + \int\limits_0^T \frac{\partial f}{\partial t}(W_t,t)\mathrm{d}t = f(W_T,T) - f(W_0,0)$$ which is akin to the chain rule of ordinary calculus.
How is that akin to the chain rule? I can't see how that relates to $(f\circ g)' = (f'\circ g)\cdot g'$.
Let $f: \mathbb{R}^2 \to \mathbb{R}$ be a differentiable mapping and set
$$g: [0,\infty) \to \mathbb{R}^2, t \mapsto \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}$$
for some differentiable mappings $x: [0,\infty) \to \mathbb{R}$ and $y: [0,\infty) \to \mathbb{R}$. Then the (classical) chain rule states that
$$\frac{d}{dt} f(g(t)) = \left( \partial_x f(x,y) \bigg|_{(x,y)=g(t)} \right) \frac{dx(t)}{dt} + \left( \partial_y f(x,y) \bigg|_{(x,y)=g(t)} \right) \frac{d y(t)}{dt}.$$
If we (formally) multiply both sides by $dt$ and integrate over $[0,T]$, we obtain
$$f(x(T),y(T)) - f(x(0),y(0)) = \int_0^T \frac{\partial}{\partial x} f(x(t),y(t)) \, dx(t) + \int_0^T \frac{\partial}{\partial y} f(x(t),y(t)) \, dy(t).$$
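For smooth paths, the integrated identity above can be checked numerically. The following is a minimal sketch, not part of the original argument: the choices $f(x,y) = x^2 y$, $x(t) = \sin t$, $y(t) = t$ and the function name `integrated_chain_rule` are assumptions made purely for illustration. It approximates both integrals by left-point Riemann-Stieltjes sums and compares them with $f(x(T),y(T)) - f(x(0),y(0))$.

```python
import math


def f(x, y):
    # Illustrative choice (an assumption): f(x, y) = x^2 * y
    return x * x * y


def f_x(x, y):
    # Partial derivative of f with respect to x
    return 2.0 * x * y


def f_y(x, y):
    # Partial derivative of f with respect to y
    return x * x


def integrated_chain_rule(T=1.0, n=100_000):
    """Return (lhs, rhs) of the integrated chain rule for x(t)=sin t, y(t)=t."""
    x = math.sin          # smooth path x(t), chosen for illustration

    def y(t):             # smooth path y(t) = t
        return t

    dt = T / n
    rhs = 0.0
    for i in range(n):
        t0, t1 = i * dt, (i + 1) * dt
        # Left-point sums for the integrals against dx(t) and dy(t)
        rhs += f_x(x(t0), y(t0)) * (x(t1) - x(t0))
        rhs += f_y(x(t0), y(t0)) * (y(t1) - y(t0))
    lhs = f(x(T), y(T)) - f(x(0.0), y(0.0))
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = integrated_chain_rule()
    print("f(x(T),y(T)) - f(x(0),y(0)):", lhs)
    print("sum of the two integrals:   ", rhs)
```

For differentiable $x$ and $y$, the point at which the integrand is evaluated inside each subinterval does not matter in the limit, so the two numbers should agree up to a discretization error that shrinks as `n` grows.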
Now if we choose $x(t) := W_t$ and $y(t) := t$, this gives
$$f(W_T,T) - f(W_0,0) = \int_0^T \frac{\partial}{\partial x} f(W_t,t) \, dW(t) + \int_0^T \frac{\partial}{\partial y} f(W_t,t) \, dt. \tag{1}$$
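As mentioned above, this is a formal calculation. Brownian paths are almost surely nowhere differentiable and of unbounded variation, so $x(t) := W_t$ does not satisfy the assumptions of the classical chain rule, and replacing the (pathwise) integral $dx(t)$ by a stochastic integral $dW_t$ is not rigorous. In fact, if $\int_0^T \ldots \, dW_t$ is read as an Itô integral, then $(1)$ does not hold in general: for sufficiently smooth $f$, Itô's formula produces the additional term $\frac{1}{2} \int_0^T \frac{\partial^2}{\partial x^2} f(W_t,t) \, dt$ on the right-hand side. The identity becomes correct if we replace the Itô integral by the Stratonovich integral $\int_0^T \ldots \, \circ dW_t$. Consequently, we get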
$$f(W_T,T) - f(W_0,0) = \int_0^T \frac{\partial}{\partial x} f(W_t,t) \, \circ dW(t) + \int_0^T \frac{\partial}{\partial y} f(W_t,t) \, dt. $$
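To see the difference between the two stochastic integrals concretely, here is a small simulation sketch. It assumes the test function $f(x,t) = x^2$ (so $\partial_x f = 2x$ and the $dt$-integral vanishes); the function name `ito_vs_stratonovich`, the step count and the seed are hypothetical choices for illustration. The Itô integral is approximated by left-point sums and the Stratonovich integral by sums that average the integrand over both endpoints of each subinterval.

```python
import math
import random


def ito_vs_stratonovich(T=1.0, n=100_000, seed=0):
    """Approximate int_0^T 2 W_t dW_t (Ito) and int_0^T 2 W_t o dW_t (Stratonovich).

    Returns (ito_sum, stratonovich_sum, W_T) for one simulated Brownian path.
    """
    rng = random.Random(seed)
    dt = T / n
    w = 0.0        # W_0 = 0
    ito = 0.0
    strat = 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment ~ N(0, dt)
        w_next = w + dw
        # Ito: integrand evaluated at the left endpoint
        ito += 2.0 * w * dw
        # Stratonovich: integrand averaged over left and right endpoints
        strat += 2.0 * 0.5 * (w + w_next) * dw
        w = w_next
    return ito, strat, w


if __name__ == "__main__":
    ito, strat, w_T = ito_vs_stratonovich()
    print("W_T^2            :", w_T ** 2)
    print("Stratonovich sum :", strat)   # matches W_T^2 (chain rule holds)
    print("Ito sum          :", ito)     # close to W_T^2 - T instead
```

For this particular $f$ the Stratonovich sums telescope to $W_T^2 = f(W_T,T) - f(W_0,0)$ up to rounding, whereas the Itô sums equal $W_T^2 - \sum (\Delta W)^2 \approx W_T^2 - T$, which is exactly the correction term from Itô's formula.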