I am currently reading the book *Reinforcement Learning: An Introduction* by R. S. Sutton and A. G. Barto. The authors often reason with the LLN. In particular, at the beginning of Section 2.2 (Action-value Methods) there is an expression like $$ \frac{\sum_{i = 1}^{t-1}R_i \mathbb{1}_{\{A_i = a\}}}{\sum_{i=1}^{t-1} \mathbb{1}_{\{A_i = a\}}}, $$ where $R_i$ is the reward and $A_i$ the action taken at time $i$. If I understand correctly, they claim that by the LLN this expression converges to the mean reward of action $a$, as long as action $a$ is chosen infinitely often. Intuitively this of course makes sense, but I am not convinced.

I am familiar with the LLN in this form: take $X_1,X_2,\dots$ iid, where $\mathbb{E}[|X_1|]$ exists. Then $$ \lim_{n \to \infty} \frac1n \sum_{i=1}^{n} X_i = \mathbb{E}[X_1] $$ almost surely.

I tried to recreate the situation from the book as follows. Take two iid sequences $X_1,X_2,\dots$ and $C_1,C_2,\dots$ (if necessary, let the two sequences be independent of each other), where $\mathbb{E}[|X_1|]$ exists and $\mathbb{P}[C_i = 1] = \mathbb{P}[C_i = -1] = 1/2$. Then intuitively the expression $$ \frac{\sum_{i = 1}^{n}X_i \mathbb{1}_{\{C_i = 1\}}}{\sum_{i=1}^{n} \mathbb{1}_{\{C_i = 1\}}} $$ should indeed converge almost surely to $\mathbb{E}[X_1]$, since the probability that $\{C_i = 1\}$ occurs only finitely many times is zero (i.e. you observe $X_i$ infinitely often almost surely).
If this is correct, how can you argue this rigorously?
For the general case, you can try the Toeplitz lemma.
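For reference, a standard formulation of the Toeplitz lemma (the weighted-average version; I am assuming this is the one meant) is: if $b_n \ge 0$ with $B_n = \sum_{k=1}^{n} b_k \to \infty$, and $x_n \to x$, then $$ \frac{1}{B_n}\sum_{k=1}^{n} b_k x_k \longrightarrow x. $$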
In your recreated situation, assuming $X_{i}$ and $C_{i}$ are independent, the proof becomes much easier. By the LLN, with probability 1 you have $$\frac{1}{n}\sum_{i=1}^{n}X_{i}1_{\{C_{i}=c\}}\to\mathbb{E}[X_{i}1_{\{C_{i}=c\}}]=\mathbb{E}[X_{i}]\,\mathbb{E}[1_{\{C_{i}=c\}}],$$ where the last equality uses independence. Meanwhile, with probability 1, $$\frac{1}{n}\sum_{i=1}^{n}1_{\{C_{i}=c\}}\to\mathbb{E}[1_{\{C_{i}=c\}}].$$ Since $\mathbb{E}[1_{\{C_{i}=c\}}]=\mathbb{P}(C_{i}=c)$ must be positive (here it is $1/2$; in general, if it were zero, the first Borel–Cantelli lemma would give $C_{i}=c$ only finitely often almost surely), dividing the first limit by the second yields the desired result.
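As a quick numerical sanity check of this argument, here is a short simulation in Python. The concrete distributions are my own arbitrary choice ($X_i \sim \mathcal{N}(3,1)$, $C_i$ uniform on $\{-1,+1\}$, the two sequences independent), just to see the ratio settle near $\mathbb{E}[X_1]$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Arbitrary concrete choice (my assumption, not from the post):
# X_i ~ Normal(3, 1), C_i uniform on {-1, +1}, sequences independent.
X = rng.normal(3.0, 1.0, size=n)
C = rng.choice([-1, 1], size=n)

mask = C == 1                              # indicator 1_{C_i = 1}
ratio = (X * mask).sum() / mask.sum()      # sample mean over the selected X_i
print(ratio)                               # should be close to E[X_1] = 3
```

With $n = 200{,}000$ the denominator is around $n/2$, so the ratio's standard error is on the order of $1/\sqrt{n/2} \approx 0.003$.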