Intuition - Ito's formula

Question

147 Views Asked by Bumbble Comm At 28 Mar 2026 - 1:59

Here is some intuition for Ito's formula.

The Taylor expansion of a function $f$ about a point $x$, evaluated at $y$, is $$ f(y) = f(x) + f'(x)(y-x) + \frac12 f''(x)(y-x)^2 + \dots \,.$$

If you replace $y-x$ with $dx$ and $f(y) - f(x)$ with $df(x)$, then $$ df(x) = f'(x)dx + \frac12 f''(x) dx^2 + \dots \,.$$

If you keep only the first term, you have the formula for the differential, $df = f' dx$.

If you keep the first two terms, you have Ito's formula, $ df = f'dx + \frac12 f'' dx^2$.

Is there some explanation for why functions of stochastic processes need the second derivative term when taking the differential of $f$? I know that we use the fact that "$dz^2 = dt$", where $z$ is a Brownian motion, but I don't fully understand that. I know that $\mathbb{E}[B_t^2] = t$; is that related? And why is the second-order term in the Taylor expansion zero for the regular differential?

Edit: on slide 8 of these lecture notes, we have: if $dX_t = a dt + b dB_t$ is an Ito process, then \begin{align*} (dX_t)^2 &= (adt + b dB_t)^2 \\ &= a^2 dt^2 + 2(adt)(bdB_t) + (bdB_t)^2 \\ &= b^2 dB_t^2 \,. \end{align*}

Why are the first two terms zero? (I also don't understand why the term $dB_t dt =0$, on page 10.)

Bumbble Comm On 10 Dec 2023 - 1:59

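As explained in the related question "Intuition between Ito-Formula", after we Taylor expand along a partition $0=t_0<t_1<\dots<t_n=t$ (using the mean-value form of the remainder, with $\theta_i$ between $B_{t_i}$ and $B_{t_{i+1}}$), we have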

$$f(B_t)=f(B_0)+\sum_{i=0}^{n-1}f'(B_{t_{i}})(B_{t_{i+1}}-B_{t_{i}})+\frac{1}{2}\sum_{i=0}^{n-1}f''(\theta_i)(B_{t_{i+1}}-B_{t_{i}})^2.$$

So if Brownian motion were of bounded variation, then, bounding one factor of each squared increment heuristically by $\frac{1}{n}$, the last term would be controlled by

$$\frac{1}{n}\sum_{i=0}^{n-1}|f''(\theta_i)|\,|B_{t_{i+1}}-B_{t_{i}}|\approx \frac{1}{n}\int |f''(B_s)|\,|dB_s|\to 0 $$

and so it disappears. In reality, however, Brownian motion is not of bounded variation, but it does have finite quadratic variation: $\sum_i (B_{t_{i+1}}-B_{t_{i}})^2 \to t$ in $L^2$ as the mesh of the partition goes to zero. The second-order term therefore survives in the limit, and so Itô instead built his integral around the quadratic variation.
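The contrast between the two kinds of variation can be seen numerically. The following is only a minimal sketch, assuming a uniform grid on $[0,1]$ and NumPy's random generator; the function name `variations` and the seed are illustrative choices, not part of the answer above.

```python
import numpy as np

def variations(n_steps, t=1.0, seed=0):
    """Return (sum |dB|, sum dB^2) for one simulated Brownian path
    on [0, t], sampled on a uniform grid with n_steps intervals."""
    rng = np.random.default_rng(seed)
    # Brownian increments over a step of length dt are independent N(0, dt).
    increments = rng.normal(0.0, np.sqrt(t / n_steps), size=n_steps)
    total_variation = np.abs(increments).sum()
    quadratic_variation = (increments ** 2).sum()
    return total_variation, quadratic_variation

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        tv, qv = variations(n)
        print(f"n={n:>9}: sum|dB| = {tv:10.2f}   sum dB^2 = {qv:.4f}")
```

As the grid is refined, the sum of absolute increments should keep growing (roughly like $\sqrt{n}$), while the sum of squared increments should settle near $t = 1$.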

The same idea works if we have finite $p$-variation; see rough path theory, e.g. *Multidimensional Stochastic Processes as Rough Paths*.

See also the geometric intuition at https://en.wikipedia.org/wiki/It%C3%B4%27s_lemma.

**Bumbble Comm** · Accepted Answer

Thomas Kojar has provided an answer and some references already, but here is an intuitive explanation, in order to stay in the same context/spirit as your question.

1) Why $\mathrm{d}x^2 \sim 0$ in the standard case?

It is to be recalled that, in the standard case of a function depending on a non-stochastic variable, the differential $\mathrm{d}f(x)$ is a somewhat "halfway unfinished computation" of the derivative $f'(x)$, so that $$ \frac{\mathrm{d}f}{\mathrm{d}x} = f'(x) + \frac{1}{2}f''(x)\,\mathrm{d}x + \ldots, $$ with all terms except for the first one vanishing when $\mathrm{d}x \rightarrow 0$. The confusing ambiguity comes from the fact that usually the limit is already contained implicitly inside the "d" notation; here, the notation is a little bit abused by using $\mathrm{d}f$ and $\mathrm{d}x$ before taking the limit.
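As a small worked example of this "unfinished" quotient (the choice $f(x) = x^3$ is only for illustration), expanding before taking the limit gives

$$ \frac{(x+\mathrm{d}x)^3 - x^3}{\mathrm{d}x} = 3x^2 + 3x\,\mathrm{d}x + \mathrm{d}x^2 \;\xrightarrow[\mathrm{d}x \to 0]{}\; 3x^2 = f'(x), $$

where $3x\,\mathrm{d}x$ is exactly the $\frac{1}{2}f''(x)\,\mathrm{d}x$ term, since $f''(x) = 6x$. Everything beyond the first term carries at least one leftover factor of $\mathrm{d}x$ and so drops out.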

2) How to treat $\mathrm{d}X_t^2$ in the stochastic case and why $\mathrm{d}B_t\mathrm{d}t \sim 0$?

In contrast, when the independent variable $x$ is random, namely $X_t$, then (some) terms inside $\mathrm{d}f(X_t)$ coming from $\mathrm{d}X_t^2$ cannot be ignored, because $\mathbb{E}[\mathrm{d}B_t^2] = \mathrm{d}t$. It is to be noted that another abuse of notation $-$ in a way, stochastic calculus is full of formalized abuses of notation $-$ is usually made by dropping the average, i.e. $\mathrm{d}B_t^2 = \mathrm{d}t$.

As before, higher-order terms are again considered negligible in the limit $\mathrm{d}t \rightarrow 0$, because they would vanish when computing a yet-to-be-formalized derivative $\frac{\mathrm{d}f(X_t)}{\mathrm{d}t}$ (with $\frac{\mathrm{d}B_t}{\mathrm{d}t}$ being interpreted as a white noise in that case). In consequence, the $o(\mathrm{d}t)$ terms, i.e. all the supralinear terms with respect to $\mathrm{d}t$, are (implicitly) cut from the initial Taylor expansion. From that point of view, $\mathrm{d}B_t$ is kept because $\mathrm{d}B_t = \mathcal{N}(0,\mathrm{d}t) = \mathcal{N}(0,1)\sqrt{\mathrm{d}t} \sim \mathrm{d}t^{1/2}$, but $\mathrm{d}B_t\mathrm{d}t$ is ruled out because $\mathrm{d}B_t\mathrm{d}t \sim \mathrm{d}t^{3/2} = o(\mathrm{d}t)$. The same counting rules out $\mathrm{d}t^2$, while $\mathrm{d}B_t^2 \sim \mathrm{d}t$ is exactly of order $\mathrm{d}t$ and must be kept.
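This order counting can be checked on a simulated path. The sketch below is illustrative only: it assumes $f(x) = x^2$, $X_t = B_t$, a uniform grid on $[0,1]$, and a hypothetical helper name `ito_terms`. For this particular $f$ the second-order Taylor expansion is exact on each step, so $B_t^2$ splits into a first-order sum and a sum of squared increments.

```python
import numpy as np

def ito_terms(n_steps, t=1.0, seed=0):
    """For f(x) = x^2 along one simulated Brownian path, return
    (B_t^2, sum 2*B_i*dB_i, sum dB_i^2, sum |dB_i|*dt)."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    dB = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    # B[i] is the path value at time i*dt, starting from B[0] = 0.
    B = np.concatenate(([0.0], np.cumsum(dB)))
    # Left-endpoint (Ito) sum for the f'(B) dB term, with f'(x) = 2x.
    first_order = np.sum(2.0 * B[:-1] * dB)
    # The (1/2) f''(B) dB^2 term, with f''(x) = 2.
    second_order = np.sum(dB ** 2)
    # Size of a dB*dt-type term accumulated over the whole grid.
    cross_term = np.sum(np.abs(dB) * dt)
    return B[-1] ** 2, first_order, second_order, cross_term

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        lhs, first, second, cross = ito_terms(n)
        print(f"n={n:>9}: B_t^2 - (first + second) = {lhs - (first + second):.2e}, "
              f"sum dB^2 = {second:.4f}, sum |dB| dt = {cross:.2e}")
```

On refinement, the sum of $\mathrm{d}B^2$ should stay near $t$, whereas the accumulated $|\mathrm{d}B|\,\mathrm{d}t$ term should shrink roughly like $\sqrt{\mathrm{d}t}$, consistent with keeping the former and dropping the latter.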

Final remark

All the above developments are valid when the stochastic process $X_t$ is made of a deterministic component, represented by the drift term $a_t\mathrm{d}t$, and a random phenomenon modelled by a normal distribution, represented here by the Gaussian noise $b_t\mathrm{d}B_t$. When the random event in question is not Gaussian, for example in the case of a Poisson process, then you will need to adapt and rederive Itô's lemma, because the relation $\mathrm{d}B_t^2 \sim \mathrm{d}t$ is not true anymore.
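To illustrate how the rule changes (a sketch only, assuming a Poisson process $N_t$ with unit jumps, which is not worked out in the answer above): an increment $\mathrm{d}N_t$ is either $0$ or $1$ over an infinitesimal interval, so squaring it changes nothing, and the change in $f$ is a finite difference rather than a truncated Taylor series:

$$ (\mathrm{d}N_t)^2 = \mathrm{d}N_t, \qquad \mathrm{d}f(N_t) = \big[f(N_{t^-}+1) - f(N_{t^-})\big]\,\mathrm{d}N_t. $$

Here $N_{t^-}$ denotes the value just before a possible jump at time $t$.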
