Applying the Law of Large Numbers recursively


If I want to apply the LLN for an estimator that uses another estimator, can I apply the LLN inside the summation and then simplify the outer summation by using the expected value of the inner one? If so, why? How can I describe these operations in a more rigorous way?

This comes in particularly handy when showing the convergence of a variance estimator:

$$ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}_n)^2, \qquad \text{where } \bar{x}_n = \frac{1}{n} \sum_{i=1}^{n} x_i. $$
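A minimal sketch of one standard route (assuming the $x_i$ are i.i.d. with finite variance) is to expand the square so that only plain averages appear:

$$ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}_n)^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar{x}_n^2 \xrightarrow{\text{a.s.}} E[X^2] - (E[X])^2 = \operatorname{Var}(X), $$

where the LLN is applied to each average separately and the continuous mapping theorem handles the square of $\bar{x}_n$, so no LLN is ever applied "inside" the sum.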




I suppose it depends on the situation; are you looking at something like this: probabilities of probabilities?

For example, suppose a coin is tossed, and there is a 50 percent chance that it has a 40 percent chance of coming up heads, and in addition a 50 percent chance that it has a 30 percent chance of coming up heads.

And if you are using the strong law of large numbers, note that there are generalizations of it, such as Kolmogorov's generalized strong law of large numbers, which extends to independent but not identically distributed random variables, provided certain variance conditions are satisfied. See http://mathworld.wolfram.com/StrongLawofLargeNumbers.html
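A standard statement of the variance condition alluded to here, for independent (not necessarily identically distributed) $X_1, X_2, \ldots$ with finite variances, is:

$$ \sum_{k=1}^{\infty} \frac{\operatorname{Var}(X_k)}{k^2} < \infty \quad \Longrightarrow \quad \frac{1}{n} \sum_{k=1}^{n} \left( X_k - E[X_k] \right) \xrightarrow{\text{a.s.}} 0. $$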

And perhaps you want to know how this differs, with regard to variance and rates of convergence, from a standard biased coin with a fixed chance of 0.35 of coming up heads (i.e., a binomial distribution with p = 0.35).

Moreover, there is the question of whether one can take the expected value in the former case (0.35) and make claims like: almost surely (with probability 1), almost all the measure is concentrated on sequences whose limiting relative frequency of heads is 0.35. I have some results on this, using generating functions.
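As an illustrative check (a minimal sketch, not the generating-function results mentioned above; it reads the setup as each toss independently drawing its chance, and the sample size and seed are arbitrary), one can simulate both processes and watch the running relative frequencies settle near 0.35:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hierarchical coin: each toss independently has a 50% chance of using
# p = 0.4 and a 50% chance of using p = 0.3 (the assumed per-toss reading).
chances = rng.choice([0.4, 0.3], size=n)
tosses_mixed = rng.random(n) < chances

# Fixed coin with p = 0.35 = 0.5 * 0.4 + 0.5 * 0.3.
tosses_fixed = rng.random(n) < 0.35

# Running relative frequencies of heads after each toss.
freq_mixed = np.cumsum(tosses_mixed) / np.arange(1, n + 1)
freq_fixed = np.cumsum(tosses_fixed) / np.arange(1, n + 1)

for m in (1_000, 10_000, 200_000):
    print(m, round(freq_mixed[m - 1], 4), round(freq_fixed[m - 1], 4))
```

Under that per-toss reading, each toss of the hierarchical coin is simply a Bernoulli(0.35) trial, which is why the two running frequencies are empirically indistinguishable; if instead a single chance were drawn once and then fixed forever, the limiting frequency would be 0.4 or 0.3, each with probability 1/2.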

That is, formally, it appears that in the traditional Kolmogorov (measure-theoretic) interpretation, whilst the two situations are formally distinguished, the empirical consequences entailed by the formalism appear to be identical and indistinguishable. At least, that is what I and one of my co-supervisors have concluded.

This is at least in relation to limiting relative frequencies within standard probability theory; I am not sure what occurs when one uses a Banach-space-valued random variable approach, or a non-standard hyper-finite formalism.

I suppose it would be interesting to consider the variances and rates of convergence using a non-standard approach.

Perhaps, with each progressive iteration of the LLN (i.e., "with probability 1, the relative frequency of trials with chance of chance of chance ... is such-and-such; with probability 1, ...; with probability 1, the limiting relative frequency of heads is 0.35"), the rate at which the frequency converges to the expected value slows down or otherwise changes.

But presumably this would require an uncountable iteration of said "almost surely"s (Pr = 1), if the appropriate variance and mean conditions are met each time, although I am not sure. I can only presume that in some sense the measure of convergent sequences is uncountably greater, but this may not be quite correct.

Perhaps something might change if there is an infinite (presumably uncountable, though maybe not) meta-distribution of probabilities.

Due to the limitation that the sequences themselves are only countably infinite, there may not be enough elements in the sequence to contain one trial, or, more importantly, infinitely many trials, for each appropriate probability value.


So one may not be able to use the same technique to compute frequency limits within the sequence: the number of trials for each p may not tend to infinity, or the relative frequency of said trials may tend to 0, and there may be issues with countable additivity, at least if the chance values are equiprobable (that is, if the expected chance of each chance value p of heads is equal).
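To spell out the countable-additivity issue in the equiprobable case: no uniform distribution over countably many chance values $p_1, p_2, \ldots$ exists, since assigning each the same probability $c$ forces

$$ 1 = \sum_{j=1}^{\infty} c = \begin{cases} 0 & \text{if } c = 0, \\ \infty & \text{if } c > 0, \end{cases} $$

a contradiction either way.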

Especially, but not only, if one is a strict frequentist.

But that is a different story with a different methodology. Frequentists (or at least some of them) arguably have worse problems to deal with on the one hand, in this situation at least, and especially if each outcome produces infinitely many or uncountably many outcomes. For example, they may require strict arithmetic convergence, at least in a hyper-finite sense, if they wish to avoid certain issues by subscribing to hyper-finite frequentism; see

Hájek (2009), philrsss.anu.edu.au/people-defaults/alanh/papers/fifteen.pdf, or The Oxford Handbook of Probability and Philosophy (in particular, the chapters by Terrence Fine, and the chapter on frequentism).

Whilst, on the other hand, frequentists generally consider convergence to be certain and would not even formally distinguish the two cases. For most frequentists, chance/probability just is the limiting relative frequency value, so the two cases would be formally equivalent in all senses.

At least with regard to combining infinitely many sequences from different collectives into a single infinite sequence such that there are no countably infinite sub-sequences for any given attribute or chance, which would otherwise entail sub-sequence limiting relative frequencies that one might then average.

And without a notion of single-case chance with which to average probabilities in the first place (since probabilities are only defined as relative frequencies within random von Mises infinite collectives, immune to place-selection rules/gambling procedures/after-effects), it would make no real sense to do so.