Are these two versions of Itô's lemma consistent with each other?


I'm reading Itô's lemma from two sources.

  1. Wikipedia page.

In its simplest form, Itô's lemma states the following: for an Itô drift-diffusion process $$ d X_t=\mu_t d t+\sigma_t d B_t $$ and any twice differentiable scalar function $f(t, x)$ of two real variables $t$ and $x$, one has $$ d f\left(t, X_t\right)=\left(\frac{\partial f}{\partial t}+\mu_t \frac{\partial f}{\partial x}+\frac{\sigma_t^2}{2} \frac{\partial^2 f}{\partial x^2}\right) d t+\sigma_t \frac{\partial f}{\partial x} d B_t $$ This immediately implies that $f\left(t, X_t\right)$ is itself an Itô drift-diffusion process. In higher dimensions, if $\mathbf{X}_t=\left(X_t^1, X_t^2, \ldots, X_t^n\right)^T$ is a vector of Itô processes such that $$ d \mathbf{X}_t=\boldsymbol{\mu}_t d t+\mathbf{G}_t d \mathbf{B}_t $$ for a vector $\boldsymbol{\mu}_t$ and matrix $\mathbf{G}_t$, Itô's lemma then states that $$ \begin{aligned} d f\left(t, \mathbf{X}_t\right) & =\frac{\partial f}{\partial t} d t+\left(\nabla_{\mathbf{X}} f\right)^T d \mathbf{X}_t+\frac{1}{2}\left(d \mathbf{X}_t\right)^T\left(H_{\mathbf{X}} f\right) d \mathbf{X}_t \\ & =\left\{\frac{\partial f}{\partial t}+\left(\nabla_{\mathbf{X}} f\right)^T \boldsymbol{\mu}_t+\frac{1}{2} \color{blue}{\operatorname{Tr}\left[\mathbf{G}_t^T\left(H_{\mathbf{X}} f\right) \mathbf{G}_t\right]}\right\} d t+\left(\nabla_{\mathbf{X}} f\right)^T \mathbf{G}_t d \mathbf{B}_t \end{aligned} $$ where $\nabla_{\mathbf{X}} f$ is the gradient of $f$ w.r.t. $\mathbf{X}$, $H_{\mathbf{X}} f$ is the Hessian matrix of $f$ w.r.t. $\mathbf{X}$, and $\operatorname{Tr}$ is the trace operator.

  2. A lecture note.

Itô formula (Multidimensional). Let $X_t$ solve $d X_t=b(t, \omega) d t+\sigma(t, \omega) d W_t$, where $X_t \in \mathbb{R}^n, \sigma \in \mathbb{R}^{n \times m}$, $W_t \in \mathbb{R}^m$. Let $Y_t=f\left(X_t\right)$, where $f \in C^2\left(\mathbb{R}^n\right)$. Then $$ d Y_t=\nabla f\left(X_t\right) \cdot d X_t+\frac{1}{2}\left(d X_t\right)^T \nabla^2 f\left(X_t\right) d X_t $$ where $\nabla^2 f=\left(\frac{\partial^2 f}{\partial x_i \partial x_j}\right)_{i, j}$ is the Hessian matrix of $f$, and products of increments are evaluated using the rules following (11) plus the additional rule $$ d W_t^i \cdot d W_t^i=d t, \quad d W_t^i \cdot d W_t^j=0 \quad \text { for } i \neq j $$ where $W_t^i$ is the ith component of $W_t$. Therefore $Y_t$ solves the equation $$ d Y_t=\left(b \cdot \nabla f+\frac{1}{2} \color{blue}{\sigma \sigma^T: \nabla^2 f}\right) d t+(\nabla f)^T \sigma d W_t $$ where $A: B=\operatorname{Tr}\left(A^T B\right)=\sum_{i, j} a_{i j} b_{i j}$.

I already verified the formula in the note here. Following notation in the note, $$ \sigma \sigma^T: \nabla^2 f = \operatorname{Tr} ((\sigma \sigma^T)^T (\nabla^2 f)) = \operatorname{Tr} ((\nabla^2 f)^T \sigma \sigma^T) = \operatorname{Tr} ((\nabla^2 f) \sigma \sigma^T). $$

For the Wikipedia page to be consistent with the note, I expect that $$ \operatorname{Tr}\left[\mathbf{G}_t^T\left(H_{\mathbf{X}} f\right) \mathbf{G}_t\right] = \operatorname{Tr}\left[\left(H_{\mathbf{X}} f\right) \mathbf{G}_t \mathbf{G}_t^T\right]. \tag{1} $$

However, it seems (1) is not generally true unless $\mathbf{G}_t$ is symmetric.

Could you help clear up my confusion?

Best answer

From the Wikipedia article on the matrix trace, we have:

The trace of a square matrix which is the product of two real matrices can be rewritten as the sum of entry-wise products of their elements, i.e. as the sum of all elements of their Hadamard product. Phrased directly, if $\mathbf{A}$ and $\mathbf{B}$ are two $m \times n$ real matrices, then: $$ \operatorname{tr}\left(\mathbf{A}^{\top} \mathbf{B}\right)=\operatorname{tr}\left(\mathbf{A B}^{\boldsymbol{\top}}\right)=\operatorname{tr}\left(\mathbf{B}^{\top} \mathbf{A}\right)=\operatorname{tr}\left(\mathbf{B} \mathbf{A}^{\top}\right)=\sum_{i=1}^m \sum_{j=1}^n a_{i j} b_{i j}. $$

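Applying the above formula with $\mathbf{A} := \mathbf{G}_t$ and $\mathbf{B} := \left(H_{\mathbf{X}} f\right) \mathbf{G}_t$ (both have the same shape as $\mathbf{G}_t$), the identity $\operatorname{tr}\left(\mathbf{A}^{\top} \mathbf{B}\right)=\operatorname{tr}\left(\mathbf{B} \mathbf{A}^{\top}\right)$ gives $$ \operatorname{Tr}\left[\mathbf{G}_t^T\left(H_{\mathbf{X}} f\right) \mathbf{G}_t\right] = \operatorname{Tr}\left[\left(H_{\mathbf{X}} f\right) \mathbf{G}_t \mathbf{G}_t^T\right]. $$ This is the cyclic property of the trace. It holds for any $\mathbf{G}_t$, symmetric or not (and even non-square), so the two versions of Itô's lemma are consistent.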
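As a quick sanity check, here is a minimal numerical sketch in plain Python. The matrices `G` and `H` are random stand-ins (an assumption for illustration, not taken from either source): `G` is a non-square, non-symmetric $3 \times 2$ matrix playing the role of $\mathbf{G}_t$, and `H` is a symmetric $3 \times 3$ matrix playing the role of the Hessian. The helper functions `matmul`, `transpose` and `trace` are written out by hand so the snippet has no dependencies.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

random.seed(0)
n, m = 3, 2  # hypothetical sizes: X_t in R^n, Brownian motion in R^m

# G is n x m, so it cannot be symmetric.
G = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]

# H is a symmetric n x n matrix, standing in for the Hessian of f.
S = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
H = [[S[i][j] + S[j][i] for j in range(n)] for i in range(n)]

GGt = matmul(G, transpose(G))                    # n x n

lhs = trace(matmul(matmul(transpose(G), H), G))  # Tr[G^T H G], an m x m trace
rhs = trace(matmul(H, GGt))                      # Tr[H G G^T], an n x n trace

# Entrywise form used in the lecture note: (G G^T) : H = sum_ij (G G^T)_ij H_ij
frob = sum(GGt[i][j] * H[i][j] for i in range(n) for j in range(n))

print(abs(lhs - rhs) < 1e-9, abs(lhs - frob) < 1e-9)
```

The two traces are taken over matrices of different sizes ($m \times m$ versus $n \times n$), yet they agree up to floating-point rounding, and both match the entrywise sum $\sigma \sigma^T : \nabla^2 f$ from the lecture note.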