I have asked a couple of statistics-related questions recently, as I have just started studying the topic again (I ignored my university course on statistics and am now kicking myself).
I asked this question recently, and even though I do appreciate all the answers people took the time to provide, I am still puzzled about the whole thing:
Monte Carlo integration, expected value of the sample mean and expected value of f(x)
I also studied this very good video and example from Khan Academy:
I tried to interpret the data generated by the exercise (which you can see in the video). In short, the exercise generates a population: it creates groups of random size, where each generated group holds a number between 1 and 20. The sum of the group sizes gives the population size. From this population we can compute the population mean and variance. All good. Then we sample the population X times, where the sample size varies from 2 to 20. The sample mean and variance are computed for each sample and then plotted (sample variance as a function of sample mean). Again, this is all good.
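To make sure I understand the setup, here is roughly how I would reproduce the exercise in NumPy. This is only a sketch: the number of groups (50) and the group-size range (1 to 10) are my own assumptions, not values from the video; only the "each group holds a number from 1 to 20" and "sample sizes from 2 to 20" parts come from the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a population out of groups of random size (sizes 1..10 are an
# assumption), where each group holds one value between 1 and 20.
group_sizes = rng.integers(1, 11, size=50)           # 50 groups (assumed count)
group_values = rng.integers(1, 21, size=50)          # each group's value, 1..20
population = np.repeat(group_values, group_sizes)    # flatten into individuals

pop_mean = population.mean()
pop_var = population.var()                           # population variance (divide by N)

# Repeatedly sample the population with a sample size between 2 and 20,
# recording each sample's mean and (unbiased) variance, as in the plot.
results = []
for _ in range(1000):
    n = rng.integers(2, 21)                          # sample size 2..20
    sample = rng.choice(population, size=n, replace=True)
    results.append((n, sample.mean(), sample.var(ddof=1)))

print(pop_mean, pop_var)
```

Plotting the recorded sample variances against the sample means should give a scatter like the one in the video.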
I would like to know if I interpret (intuitively) the data correctly:
This exercise clearly shows me the difference between "estimation" and "approximation". You can see that even with a small sample size (say 2) we can get a good estimate of the population mean and variance. Of course this happens by "chance", but the probability that it happens is nonzero.
Now, the problem I am debating with a student friend of mine is this. I tell him that it is not because the "size" of the sample increases that the probability of the sample giving a "better" estimate (compared to a sample of smaller size) increases accordingly! My argument is that, as the sample size increases, the sample's results "converge" to the population's parameters due to the law of large numbers. But the probability that you get a value close to the population parameters doesn't really change just because the sample size is 2 rather than 20.
I feel I am right and wrong at the same time. But I like this explanation because it seems to make it possible to explain two distinct phenomena from a single set of data:
1) a sample gives an estimate (and not an approximation) of the population's parameters. The smaller the sample, the more likely the estimate is to be way off from the population parameters. That still suggests there is some relation between sample size and the probability of getting an estimate close to the population parameter, but I don't know which relation I can establish, or whether it is even accurate to say so.
2) however, increasing the size of the samples can't be seen as a guarantee of getting a better estimate from a probability point of view. It does give us the guarantee that the estimate converges to the exact value because of the LLN (which then looks more like an approximation than an estimation to me).
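To test my two points numerically, I tried the following sketch (the population here is just 10,000 uniform draws from 1 to 20, my own stand-in, not the one from the exercise). It measures the empirical spread of the sample mean around the population mean for several sample sizes and compares it with the theoretical standard error σ/√n:

```python
import numpy as np

rng = np.random.default_rng(1)
population = rng.integers(1, 21, size=10_000)    # hypothetical population, values 1..20
sigma = population.std()                         # population standard deviation

spread = {}
for n in (2, 20, 200):
    # Empirical standard deviation of the sample mean over many repeated samples.
    means = [rng.choice(population, size=n).mean() for _ in range(5_000)]
    spread[n] = float(np.std(means))
    print(n, spread[n], sigma / np.sqrt(n))      # the two numbers roughly agree
```

If I read the output correctly, a single small sample *can* land close to the population mean (point 1), but the typical deviation shrinks like σ/√n as n grows, which seems to contradict my claim that the probability "doesn't really change" with sample size.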
It would be great if someone could tell me whether I am on the right path (and if not, correct me). I would really like to understand 1) how to interpret the results, and 2) where the line is between estimators and the LLN.
Thank you so much for your time and knowledge.
EDIT: reading the Wikipedia page on the CLT, I see it says "By the law of large numbers, the sample averages converge in probability and almost surely to the expected value µ as n → ∞." So I assume this is where the relationship is. If the sample mean converges in probability, it means the probability of getting a value arbitrarily close to the population parameter increases as n increases. Could someone please confirm this is right?
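To check this reading of convergence in probability, I sketched a small simulation (again with a made-up uniform population and an arbitrary tolerance ε = 0.5 of my own choosing): it estimates P(|X̄ₙ − µ| < ε) for a few sample sizes, which should climb toward 1 as n grows if my interpretation is correct.

```python
import numpy as np

rng = np.random.default_rng(2)
population = rng.integers(1, 21, size=10_000)    # hypothetical population, values 1..20
mu = population.mean()
eps = 0.5                                        # arbitrary closeness tolerance

p_close = {}
for n in (2, 20, 200):
    means = np.array([rng.choice(population, size=n).mean() for _ in range(5_000)])
    # Fraction of sample means within eps of mu: an estimate of P(|X̄_n - µ| < ε).
    p_close[n] = float(np.mean(np.abs(means - mu) < eps))
    print(n, p_close[n])                         # climbs toward 1 as n grows
```

This is exactly the statement of convergence in probability: for any fixed ε > 0, P(|X̄ₙ − µ| < ε) → 1 as n → ∞.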