Can you prove, directly from the definition of the Itô integral (by which I mean as a limit of integrals of simple processes), that
$$\int_{0}^{t}B(s)^{2} dB(s) = \frac{1}{3}B(t)^{3}-\int_{0}^{t} B(s) ds $$
So I started by writing $B(t)^{3} - B(0)^{3}= \sum_{i=0}^{n-1}\left[B(s_{i+1})^{3}-B(s_{i})^{3}\right]$, where $0=s_{0}<s_{1}<\dots<s_{n}=t$ and $\max_i (s_{i+1}-s_{i})$ tends to 0 as $n$ tends to infinity. Where to go from here? Maybe bring in $(B(s_{i+1})-B(s_{i}))^{3}$?
Thanks!
For simplicity I take $t=1$; all the arguments adapt to a generic $t>0$.
We introduce the dyadic mesh $a_k=\frac{k}{2^n}$, $k = 0,\dots,2^n$, and note that, since $B(0)=0$, the sum telescopes:
$B(1)^3=\sum_{k=0}^{2^n-1}\left[B^3(a_{k+1})-B^3(a_{k})\right]$
Now we can exploit that $a^3-b^3=(a-b)(a^2+ab+b^2)$. Therefore:
\begin{align} B(1)^3&=\sum_{k=0}^{2^n-1}(B(a_{k+1})-B(a_{k}))(B^2(a_{k+1})+B^2(a_{k})+B(a_{k})B(a_{k+1}))\\ &=3\sum_{k=0}^{2^n-1} (B(a_{k+1})-B(a_{k}))B^2(a_{k})+ \sum_{k=0}^{2^n-1} (B(a_{k+1})-B(a_{k}))^2(B(a_{k+1})+2B(a_{k})) \end{align}
where only algebraic manipulations have been used, namely $a^2+ab+b^2=3b^2+(a-b)(a+2b)$ with $a=B(a_{k+1})$ and $b=B(a_k)$.
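Since the decomposition above is purely algebraic, it should hold exactly (up to floating-point rounding) on any discretised path. The following is a minimal numerical sketch of that check, assuming NumPy; the function names `simulate_brownian_path` and `decomposition_terms` are illustrative and not part of the argument.

```python
import numpy as np

def simulate_brownian_path(n, rng):
    """Return B(a_0), ..., B(a_{2^n}) on the dyadic mesh a_k = k / 2^n of [0, 1]."""
    steps = 2 ** n
    dt = 1.0 / steps
    # Independent N(0, dt) increments; B(0) = 0.
    increments = rng.normal(0.0, np.sqrt(dt), size=steps)
    return np.concatenate(([0.0], np.cumsum(increments)))

def decomposition_terms(path):
    """Return the left-point sum and the quadratic correction sum."""
    left, right = path[:-1], path[1:]
    d = right - left
    ito_sum = np.sum(d * left ** 2)                   # sum (dB) B^2(a_k)
    correction = np.sum(d ** 2 * (right + 2 * left))  # sum (dB)^2 (B(a_{k+1}) + 2 B(a_k))
    return ito_sum, correction

rng = np.random.default_rng(0)
path = simulate_brownian_path(10, rng)
ito_sum, correction = decomposition_terms(path)
# The two sides agree up to rounding error, for every path and every n.
print(abs(path[-1] ** 3 - (3 * ito_sum + correction)))
```

The check says nothing about convergence; it only confirms that no term was lost in the algebra.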
At this point, by definition of the Itô integral:
$\sum_{k=0}^{2^n-1} (B(a_{k+1})-B(a_{k}))B^2(a_{k}) \rightarrow \int_0^1 B^2(s)dB(s)$
where the convergence is in $L^2$ as the mesh is refined ($n \to \infty$). The final result follows by observing that:
\begin{align} &F(n)\equiv\sum_{k=0}^{2^n-1} (B(a_{k+1})-B(a_{k}))^2 B(a_{k+1}) \rightarrow \int_0^1 B(s) ds\\ &G(n)\equiv\sum_{k=0}^{2^n-1} (B(a_{k+1})-B(a_{k}))^2 B(a_{k}) \rightarrow \int_0^1 B(s) ds \end{align}
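Indeed, the second sum in the decomposition of $B(1)^3$ is $F(n)+2G(n) \rightarrow 3\int_0^1 B(s)ds$, so that $B(1)^3 = 3\int_0^1 B^2(s)dB(s) + 3\int_0^1 B(s)ds$, which is the claim after dividing by 3. The two limits are expected to hold by the naive argument $(dB(s))^2 \sim ds$ and, with a bit of patience, can be formally derived. I sketch a proof, leaving out the details.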
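First we show that $F(n)-G(n) \rightarrow 0$ in $L^2$. Since $F(n)-G(n)=\sum_k (B(a_{k+1})-B(a_{k}))^3$:
$E[|F(n)-G(n)|^2]=\sum_{k,k'}E[(B(a_{k+1})-B(a_{k}))^3(B(a_{k'+1})-B(a_{k'}))^3]=\sum_{k}E[(B(a_{k+1})-B(a_{k}))^6] \rightarrow 0$
The cross terms with $k \neq k'$ vanish because the increments are independent and have zero third moment. The remaining sum tends to 0 because $E[(B(a_{k+1})-B(a_{k}))^6] = 15(a_{k+1}-a_{k})^3 = 15 \cdot 2^{-3n}$, so the sum of $2^n$ such terms equals $15 \cdot 2^{-2n}$.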
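As a rough sanity check of this step (not part of the proof), one can estimate $E[|F(n)-G(n)|^2]$ by Monte Carlo and compare it with the value $15\cdot 2^{-2n}$ obtained from the Gaussian sixth moment $E[Z^6]=15$. This is a sketch assuming NumPy; the function names and the sample size are arbitrary choices.

```python
import numpy as np

def mean_square_f_minus_g(n, num_paths, rng):
    """Monte Carlo estimate of E|F(n) - G(n)|^2 on the dyadic mesh of [0, 1]."""
    steps = 2 ** n
    dt = 1.0 / steps
    # Each row holds the 2^n independent N(0, dt) increments of one path.
    d = rng.normal(0.0, np.sqrt(dt), size=(num_paths, steps))
    # F(n) - G(n) is the sum of the cubed increments.
    return np.mean(np.sum(d ** 3, axis=1) ** 2)

def exact_mean_square(n):
    """2^n terms, each equal to 15 * dt^3, i.e. 15 / 4^n."""
    return 15.0 / 4 ** n

rng = np.random.default_rng(1)
for n in (2, 4, 6):
    print(n, mean_square_f_minus_g(n, 20_000, rng), exact_mean_square(n))
```

The estimate and the closed form should agree up to sampling noise, and both shrink by a factor of about 4 each time $n$ increases by 1.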
Then we prove that $G(n) \rightarrow \int_0^1 B(s) ds$. Since $\sum_k B(a_k)(a_{k+1}-a_k) \rightarrow \int_0^1 B(s)ds$ it is sufficient to show that:
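\begin{align} E\left[\left(\sum_{k=0}^{2^n-1} \left[(B(a_{k+1})-B(a_{k}))^2-(a_{k+1}-a_k)\right] B(a_{k}) \right)^2\right] &= \sum_{k=0}^{2^n-1} E\left[ \left[(B(a_{k+1})-B(a_{k}))^2-(a_{k+1}-a_k)\right]^2 B^2(a_{k})\right]\\ &= \sum_{k=0}^{2^n-1} E\left[ \left[(B(a_{k+1})-B(a_{k}))^2-(a_{k+1}-a_k)\right]^2\right]a_{k}\\ &= 2\sum_{k=0}^{2^n-1} (a_{k+1}-a_{k})^2 a_{k} \le 2\cdot 2^{-n} \rightarrow 0 \end{align}
The first equality holds because the cross terms vanish: for $k<k'$ the factor $(B(a_{k'+1})-B(a_{k'}))^2-(a_{k'+1}-a_{k'})$ has zero mean and is independent of the other factors in the product. The second uses the independence of $B(a_{k})$ from $B(a_{k+1})-B(a_{k})$ together with $E[B^2(a_{k})]=a_k$. The third uses the Gaussian moments, which give $E\left[\left[(B(a_{k+1})-B(a_{k}))^2-(a_{k+1}-a_k)\right]^2\right]=2(a_{k+1}-a_k)^2$.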
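The last estimate can also be checked numerically. With $a_{k+1}-a_k=2^{-n}$, the sum $2\sum_k (a_{k+1}-a_k)^2 a_k$ evaluates to $(2^n-1)/4^n$, and a Monte Carlo estimate of the left-hand side should match it up to sampling noise. This is only an illustrative sketch assuming NumPy, with made-up function names, not a substitute for the argument above.

```python
import numpy as np

def mean_square_g_minus_riemann(n, num_paths, rng):
    """Monte Carlo estimate of E|G(n) - sum_k B(a_k)(a_{k+1} - a_k)|^2."""
    steps = 2 ** n
    dt = 1.0 / steps
    d = rng.normal(0.0, np.sqrt(dt), size=(num_paths, steps))
    # left[:, k] = B(a_k), with B(a_0) = 0.
    left = np.cumsum(d, axis=1) - d
    g = np.sum(d ** 2 * left, axis=1)      # G(n)
    riemann = np.sum(left * dt, axis=1)    # left-point Riemann sum of B
    return np.mean((g - riemann) ** 2)

def exact_mean_square(n):
    """2 * sum_k dt^2 * a_k with a_k = k * dt, which equals (2^n - 1) / 4^n."""
    steps = 2 ** n
    return (steps - 1) / steps ** 2

rng = np.random.default_rng(2)
for n in (4, 6, 8):
    print(n, mean_square_g_minus_riemann(n, 10_000, rng), exact_mean_square(n))
```

Here the error decays like $2^{-n}$, slower than the $2^{-2n}$ rate of $F(n)-G(n)$, so this term dominates the $L^2$ error of the discretisation.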