Suppose we want to numerically approximate the following stochastic integral: \begin{equation} \displaystyle\int_0^T h(t)\,dW(t) \tag{1} \end{equation} where $dW(t)$ denotes the Wiener increment and $h(t)=W(t)$.
Let us consider $t_j=j\cdot\delta t$, with $N\cdot\delta t=T$, and the "left-point" approximation to $(1)$: \begin{equation} \displaystyle\sum_{j=0}^{N-1}h\left(t_j\right)\left(W\left(t_{j+1}\right)-W\left(t_j\right)\right) \tag{2} \end{equation} Now let us rewrite $(2)$ with the explicit substitution $h(t_j)=W(t_j)$. We get: \begin{equation*} \begin{split} \displaystyle\sum_{j=0}^{N-1}h\left(t_j\right)\left(W\left(t_{j+1}\right)-W\left(t_j\right)\right)&=\displaystyle\sum_{j=0}^{N-1}W\left(t_j\right)\left(W\left(t_{j+1}\right)-W\left(t_j\right)\right)\\&=\dfrac{1}{2}\bigg(W(T)^2-W(0)^2-\displaystyle\sum_{j=0}^{N-1}(W(t_{j+1})-W(t_j))^2\bigg) \end{split} \end{equation*} Now, the term $\displaystyle\sum_{j=0}^{N-1}(W(t_{j+1})-W(t_j))^2$ can be shown to have expected value $T$ and variance of $O(\delta t)$. Hence, as $\delta t\to0$, we expect this random variable to be close to the constant $T$. Since $W(0)=0$, this gives: \begin{equation} \displaystyle\int_{0}^{T}W(t)\,dW(t)=\dfrac{1}{2}W(T)^2-\dfrac{1}{2}T \tag{3} \end{equation}
I have a doubt as to $(3)$. Why does $\mathbb{E}\left(W(T)^2\right)$ simply correspond to $W(T)^2$ in $(3)$?
Why can I consider $W(T)$ (i.e. the value of the Wiener process at time $T$) as a constant? Why isn't it a random variable?
The argument as stated hand-waves a "rotation by 90 degrees": the averaging happens along the time subdivision, not over the probability space. The actual chain of reasoning goes through the central limit theorem.
Each of the increments $W_{t_{j+1}}-W_{t_j}$ is a normal random variable $N(0,\delta t)$, independent of the others, so that $(W_{t_{j+1}}-W_{t_j})^2$ has mean $\delta t$ and finite variance $2\,\delta t^2$. Thus the sum $$ \sum_{j=0}^{N-1}(W_{t_{j+1}}-W_{t_j})^2 $$ is increasingly well approximated by a random variable $\sim N(N\cdot\delta t,\,2N\cdot\delta t^2)=N(T,\,2T\cdot\delta t)$.
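As a worked check of the mean and variance of the sum of squares, here is a short sketch of the moment computation. The shorthand $\Delta W_j=W_{t_{j+1}}-W_{t_j}$ is introduced only for this calculation, and it uses the standard fact $\mathbb{E}[Z^4]=3\sigma^4$ for $Z\sim N(0,\sigma^2)$ together with the independence of the increments:

$$
\begin{aligned}
\mathbb{E}\left[\Delta W_j^2\right]&=\delta t,\\
\operatorname{Var}\left[\Delta W_j^2\right]&=\mathbb{E}\left[\Delta W_j^4\right]-\left(\mathbb{E}\left[\Delta W_j^2\right]\right)^2=3\,\delta t^2-\delta t^2=2\,\delta t^2,\\
\mathbb{E}\left[\sum_{j=0}^{N-1}\Delta W_j^2\right]&=N\cdot\delta t=T,\\
\operatorname{Var}\left[\sum_{j=0}^{N-1}\Delta W_j^2\right]&=2N\cdot\delta t^2=2T\cdot\delta t\xrightarrow[\delta t\to0]{}0.
\end{aligned}
$$

So the variance of the sum is $O(\delta t)$, while nothing in this computation touches $W(T)$ itself.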
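Thus in the limit $\delta t\to0$, the variance of the sum of squares vanishes and the sum has value $T$ with probability $1$. By contrast, $W(T)^2$ in $(3)$ is never averaged: it remains a random variable, and $(3)$ is an identity between random variables. Apart from determining the distribution of the increments, the computation uses no expectations over the probability space, only sums over the time subdivision (which is the direction perpendicular to the probability space).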
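The point can also be checked numerically. The following is a minimal sketch, not a definitive implementation: the function name `simulate_left_point_sums` and the values of `T`, `n_steps`, `n_paths`, and `seed` are illustrative assumptions, not part of the question. For a few sample paths it computes the left-point sum $(2)$, the sum of squared increments, and $W(T)$.

```python
import numpy as np

def simulate_left_point_sums(T=1.0, n_steps=100_000, n_paths=5, seed=0):
    """Return W(T), the left-point sum (2), and the sum of squared
    increments, one value per simulated path (illustrative helper)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Wiener increments W(t_{j+1}) - W(t_j) ~ N(0, dt), one row per path
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    # cumsum gives W(t_{j+1}); subtracting dW gives the left endpoints W(t_j),
    # with W(0) = 0
    W_left = np.cumsum(dW, axis=1) - dW
    W_T = dW.sum(axis=1)
    ito_sum = (W_left * dW).sum(axis=1)   # left-point sum (2)
    sq_sum = (dW ** 2).sum(axis=1)        # sum of squared increments
    return W_T, ito_sum, sq_sum

T = 1.0
W_T, ito_sum, sq_sum = simulate_left_point_sums(T=T)
for w, s, q in zip(W_T, ito_sum, sq_sum):
    print(f"W(T)={w:+.4f}  left-point sum={s:+.4f}  "
          f"W(T)^2/2 - T/2={0.5 * w * w - 0.5 * T:+.4f}  sum of squares={q:.4f}")
```

Under these assumptions, the sum of squares should come out close to $T$ on every path, while $W(T)$ differs from path to path, and the left-point sum tracks $\frac12 W(T)^2-\frac12 T$ path by path.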