Resources for understanding the mathematics behind multi-armed bandits

I have recently been trying to understand the proofs of convergence and the regret analysis for multi-armed bandits. These proofs seem to draw on a variety of mathematical tools, such as measure theory, sampling statistics, and convergence bounds.

I am trying to understand papers such as:

  1. http://proceedings.mlr.press/v23/agrawal12/agrawal12.pdf
  2. http://www.jmlr.org/papers/volume3/auer02a/auer02a.pdf
  3. https://arxiv.org/pdf/1204.1909.pdf
  4. http://www.economics.uci.edu/~ivan/asmb.874.pdf

I know the basics of probability and statistics up to undergraduate level. However, I am having difficulty fully understanding the proofs in these papers. Any resource, such as a book or a course, that I should work through before tackling these proofs would be very helpful.

Thanks.