I am currently reading the book *Reinforcement Learning: An Introduction* by R. S. Sutton and A. G. Barto. The authors often reason with the LLN. In particular, at the beginning of Section 2.2 (Action-value Methods) there is an expression like $$ \frac{\sum_{i = 1}^{t-1}R_i \mathbb{1}_{\{A_i = a\}}}{\sum_{i=1}^{t-1} \mathbb{1}_{\{A_i = a\}}}, $$ where $R_i$ is the reward and $A_i$ the action taken at time $i$. If I understand correctly, they claim that by the LLN this expression converges to the mean reward of action $a$, as long as action $a$ is chosen infinitely often. Intuitively this of course makes sense, but I am not convinced.

I am familiar with the LLN in this form: take $X_1,X_2,\dots$ iid, where $\mathbb{E}[|X_1|]$ exists. Then $$ \lim_{n \to \infty} \frac1n \sum_{i=1}^{n} X_i = \mathbb{E}[X_1] $$ almost surely.

I tried to recreate the situation from the book as follows. Take two iid sequences $X_1,X_2,\dots$ and $C_1,C_2,\dots$ (if necessary, let the two sequences be independent of each other), where $\mathbb{E}[|X_1|]$ exists and $\mathbb{P}[C_i = 1] = \mathbb{P}[C_i = -1] = 1/2$. Then intuitively the expression $$ \frac{\sum_{i = 1}^{n}X_i \mathbb{1}_{\{C_i = 1\}}}{\sum_{i=1}^{n} \mathbb{1}_{\{C_i = 1\}}} $$ should indeed converge almost surely to $\mathbb{E}[X_1]$, since the probability that $\{C_i = 1\}$ occurs only finitely many times is zero (i.e. you observe $X_i$ infinitely often almost surely).
If this is correct, how can you argue this rigorously?
For the general case, you can try the Toeplitz lemma.
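For reference, a standard formulation of the Toeplitz lemma (the weighted-average version; I am assuming this is the one meant) is: if $b_n \ge 0$ with $B_n = \sum_{k=1}^{n} b_k \to \infty$, and $x_n \to x$, then $$ \frac{1}{B_n}\sum_{k=1}^{n} b_k x_k \longrightarrow x. $$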
In your recreated situation, assuming $X_{i}$ and $C_{i}$ are independent, the proof becomes much easier. By the LLN, with probability 1 you have $$\frac{1}{n}\sum_{i=1}^{n}X_{i}1_{\{C_{i}=c\}}\to\mathbb{E}[X_{i}1_{\{C_{i}=c\}}]=\mathbb{E}[X_{i}]\,\mathbb{E}[1_{\{C_{i}=c\}}],$$ where the last equality uses independence. Meanwhile, with probability 1, $$\frac{1}{n}\sum_{i=1}^{n}1_{\{C_{i}=c\}}\to\mathbb{E}[1_{\{C_{i}=c\}}].$$ Since $\mathbb{E}[1_{\{C_{i}=c\}}]=\mathbb{P}(C_{i}=c)$ must be positive (here it is $1/2$; in general, if it were zero, the first Borel–Cantelli lemma would give $C_{i}=c$ only finitely often almost surely), dividing the first limit by the second yields the desired result.
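As a quick numerical sanity check of this argument, here is a short simulation in Python. The concrete distributions are my own arbitrary choice ($X_i \sim \mathcal{N}(3,1)$, $C_i$ uniform on $\{-1,+1\}$, the two sequences independent), just to see the ratio settle near $\mathbb{E}[X_1]$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Arbitrary concrete choice (my assumption, not from the post):
# X_i ~ Normal(3, 1), C_i uniform on {-1, +1}, sequences independent.
X = rng.normal(3.0, 1.0, size=n)
C = rng.choice([-1, 1], size=n)

mask = C == 1                              # indicator 1_{C_i = 1}
ratio = (X * mask).sum() / mask.sum()      # sample mean over the selected X_i
print(ratio)                               # should be close to E[X_1] = 3
```

With $n = 200{,}000$ the denominator is around $n/2$, so the ratio's standard error is on the order of $1/\sqrt{n/2} \approx 0.003$.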