I was reading Chapter 17 of Hamilton's "Time Series Analysis" about univariate processes with unit roots. In particular, I am looking at the AR(1) process $y(t) = \rho y(t-1) + \epsilon(t)$ with true value $\rho = 1$, i.e. $y(t) = y(t-1) + \epsilon(t)$, where $\epsilon(t) \sim N(0, \sigma^2)$ i.i.d. The deviation of the OLS estimator from the true value reads $\hat{\rho} - 1 = \frac{\sum_{t=1}^T \epsilon(t)y(t-1)}{\sum_{t=1}^T y^2(t-1)}$. The proof of the convergence of the numerator (rescaled by $T^{-1}\sigma^{-2}$) to $\frac{1}{2}\left(\chi^2(1) - 1\right)$ is clear and straightforward. I am a bit puzzled about the use of tools like the Functional Central Limit Theorem and the Continuous Mapping Theorem to study the convergence of the denominator (rescaled by $T^{-2}\sigma^{-2}$) to $$\int_0^1 W^2(s)\mathrm{d} s.$$ Can't one just use the definition of the integral to see this?
This is what I thought: if $t$ is an integer, one can always write $W(t) = \sum_{s=1}^t \left(W(s) - W(s-1)\right) \sim \sum_{s=1}^t \xi(s)$ with $\xi(s) \sim N(0, 1)$ i.i.d. We also know that $\sqrt{T} W\left(\frac{t}{T}\right)$ is a Brownian motion (see, e.g., "Prove the scaling property of a Brownian motion"). So why can't one write (bearing in mind that $y(t) = \sum_{s=1}^t \epsilon(s)$, assuming $y(0) = 0$ as in Hamilton's book)
$$ \frac{1}{T}\sum_{t=1}^T \frac{1}{T}\left(\sum_{s=1}^{t-1} \frac{\epsilon(s)}{\sigma}\right)^2 \sim \frac{1}{T} \sum_{t=1}^T \frac{1}{T}W^2(t-1) \sim \frac{1}{T} \sum_{t=1}^T W^2\left(\frac{t-1}{T}\right) =\frac{1}{T} \sum_{t=1}^T W^2\left(\frac{t}{T}\right) - \frac{1}{T}W^{2}(1) \approx \int_0^1 W^2(s) \mathrm{d}s $$
(and the last approximation is just the definition of the Riemann integral - the correction term $T^{-1}W^2(1)$ seems to me to go to 0 in $L^1$). This should be understood at a fixed "path" $\omega$ - and $t \mapsto W(t, \omega)$ is continuous almost surely. Shouldn't this ensure the convergence in distribution? Is it wrong in principle, or can this argument be adjusted to become a "real" proof? Thanks a lot!
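To make the pathwise claim concrete, here is a minimal numerical sketch (not a proof). It assumes one Brownian path sampled on a fine uniform grid, and it uses the finest-grid left Riemann sum as a stand-in for $\int_0^1 W^2(s)\mathrm{d}s$. The helper names `brownian_path` and `left_riemann_sum` are purely illustrative.

```python
import random

def brownian_path(n_steps, rng):
    """Sample W(0), W(1/n), ..., W(1) on a uniform grid with n_steps steps."""
    dt = 1.0 / n_steps
    path = [0.0]  # W(0) = 0
    for _ in range(n_steps):
        # Independent N(0, dt) increments
        path.append(path[-1] + rng.gauss(0.0, dt ** 0.5))
    return path

def left_riemann_sum(path, T):
    """(1/T) * sum_{t=1}^T W((t-1)/T)^2, obtained by subsampling the fine path."""
    n_steps = len(path) - 1
    stride = n_steps // T  # assumes T divides n_steps
    return sum(path[(t - 1) * stride] ** 2 for t in range(1, T + 1)) / T

rng = random.Random(0)   # fixed seed: one fixed path omega
fine = 2 ** 14
w = brownian_path(fine, rng)

# Finest-grid sum, used here as a proxy for the integral of W^2 over [0, 1]
reference = left_riemann_sum(w, fine)

# Coarser Riemann sums along the SAME path
sums = {T: left_riemann_sum(w, T) for T in (16, 128, 1024)}
for T, value in sums.items():
    print(T, value, abs(value - reference))
```

On a single path the coarse sums should approach the fine-grid value as $T$ grows, which is what the Riemann-sum step in the argument relies on.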
EDIT: Ok, maybe I understood what doesn't work... Here I am in the scenario where $X_n \to X$ and $Y_n \to Y$ in distribution - hence I cannot use Slutsky's Theorem (as it would need one of the two sequences to converge in probability to a constant, which is not the case here). In principle by using the fact that $Y > 0$ a.s., I could still hope to just invoke the Continuous Mapping Theorem here as $g(x, y) = xy^{-1}$ is continuous, but I do not have independence of the limit distributions as the $\chi^2$ in the numerator comes as $W^2(1)$ from the same Brownian motion... Maybe that's the reason why one needs the FCLT?
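The dependence between the two limits can also be seen in a simulation. The sketch below generates the random walk with $y(0) = 0$, computes the rescaled numerator and denominator on each path, and reports their sample correlation. The values of `T`, `sigma` and `n_paths` and the function names are arbitrary choices made for illustration.

```python
import random

def scaled_statistics(T, sigma, rng):
    """Return (T^-1 sigma^-2 sum eps(t) y(t-1), T^-2 sigma^-2 sum y(t-1)^2) for one path."""
    y_prev = 0.0  # y(0) = 0
    num = 0.0
    den = 0.0
    for _ in range(T):
        eps = rng.gauss(0.0, sigma)
        num += eps * y_prev
        den += y_prev ** 2
        y_prev += eps  # y(t) = y(t-1) + eps(t)
    return num / (T * sigma ** 2), den / (T ** 2 * sigma ** 2)

def sample_correlation(xs, ys):
    """Plain Pearson correlation of two equally long lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
T, sigma, n_paths = 200, 2.0, 2000
draws = [scaled_statistics(T, sigma, rng) for _ in range(n_paths)]
nums = [d[0] for d in draws]
dens = [d[1] for d in draws]

corr = sample_correlation(nums, dens)
print("mean numerator:", sum(nums) / n_paths)
print("mean denominator:", sum(dens) / n_paths)
print("correlation:", corr)
```

A clearly positive correlation is consistent with both quantities being functionals of the same path, so the limit of the ratio needs their joint convergence and not only the two marginal limits.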