My lecture slides begin,
Consider a sequence of random numbers $X_1, X_2, \dots, X_n$ (the lecturer stated that random numbers and random variables, which in my understanding are maps from a sample space to the real line, can be used interchangeably), all sampled from the same distribution, which has mean (expectation) $\mu$.
Then the arithmetic average of the sequence
$S_n=\frac{1}{n}(X_1+X_2+\dots+X_n)\approx \mu$
and
$S_n=\frac{1}{n}(X_1+X_2+\dots+X_n)\rightarrow \mu$ as $n\rightarrow \infty$.
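The statement on the slides can be checked empirically. Here is a minimal sketch (the die example and the sample size are my own choices, not from the slides) showing the arithmetic average $S_n$ settling near $\mu$:

```python
import random

random.seed(0)

# Sample X_1, ..., X_n from a fair six-sided die, whose mean is mu = 3.5.
mu = 3.5
n = 100_000
samples = [random.randint(1, 6) for _ in range(n)]

# S_n is the arithmetic average of the first n samples; for large n it
# should be close to mu, as the law of large numbers asserts.
S_n = sum(samples) / n
print(S_n)
```

Running this for increasing $n$ shows the average wandering less and less around $3.5$.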
It then goes on to state almost sure convergence:
$X_n \rightarrow X \iff P(\{\omega\in\Omega : X_n(\omega)\rightarrow X(\omega) \text{ as } n\rightarrow \infty\})=1.$
My question is a conceptual one: what do we mean by $X_n\rightarrow X$ and $X_n(\omega)\rightarrow X(\omega)$?
For that matter, what do we mean by $X_n$? Are these approximations to the distribution of $X$ after the $n$th trial? If so, when forming $S_n$ are we summing "maps from a sample space to the real line"? How would one add a map?
Am I correct in reading almost sure convergence as "$X_n$ converges to $X$ if and only if, with probability $1$, the outcome $\omega$ is such that $X_n(\omega)$ converges to $X(\omega)$ as we take successive approximations $X_n$ of $X$"?
Please help!
First, a distribution is not the same thing as a random variable. A random variable $X$ is a map from a sample space, usually denoted $\Omega$, to $\mathbb{R}$ (or whatever measurable space). The distribution of a random variable is a probability measure on this measurable space.
When we write $S_n = (X_1+\dots +X_n)/n$, it just means that for each $\omega\in\Omega$, $S_n(\omega) = (X_1(\omega) + \dots + X_n(\omega))/n$. This is the usual pointwise sum of maps.
Now, almost sure convergence is stronger than convergence in distribution: it is pointwise convergence, for almost every $\omega$. Here "almost" means that the set of $\omega$'s for which the convergence holds has probability $1$.
So whichever $\omega$ you pick in this probability-$1$ set, you'll have, for $n$ large enough, $S_n(\omega)\simeq E[X_1]$. Once you've picked $\omega$, it's just convergence of a sequence of real numbers. So it's not about the distributions; it's about the random variables themselves, which is stronger.
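One way to see "picking an $\omega$" computationally: fixing the seed of a pseudorandom generator fixes one realized sample path, and the resulting $S_n(\omega)$ is then an ordinary sequence of real numbers. A sketch, assuming uniform samples on $[0,1]$ so that $E[X_1]=0.5$ (the seed and sample sizes are my own choices):

```python
import random

# Fixing omega amounts to fixing one realized sequence of samples;
# with a fixed seed we observe one such realization.
random.seed(42)

# Record the realized averages S_n(omega) at a few values of n.
running_sum = 0.0
averages = []
for n in range(1, 200_001):
    running_sum += random.random()  # X_n(omega), uniform on [0, 1]
    if n in (10, 1000, 200_000):
        averages.append(running_sum / n)

print(averages)  # one ordinary real sequence drifting toward E[X_1] = 0.5
```

Re-running with a different seed gives a different $\omega$, hence a different realized sequence, but (almost surely) the same limit.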
$X_n\rightarrow X$ is ambiguous because there are several modes of convergence for random variables, so we usually specify which one is meant. For almost sure convergence we can write $$X_n\rightarrow X\, \textrm{a.s.}$$ where "a.s." means "almost surely".