I need a clarification about the notation used in the main theorem of the proof "Balls into Bins" - A Simple and Tight Analysis
The theorem states that:
Let $M$ be the random variable that counts the maximum number of balls into any bin, if we throw $m$ balls independently and uniformly at random into $n$ bins. Then $Pr[M > k_\alpha] = o(1)$ if $\alpha > 1$ and $Pr[M > k_\alpha] = 1 - o(1)$ if $ 0 < \alpha < 1 $, where
$k_\alpha = \frac{m}{n} + \sqrt{\frac{2m \log n}{n}\left(1 - \frac{1}{\alpha}\frac{\log^{(2)} n}{2 \log n}\right)}$, if $m \gg n (\log n)^3$.
I skipped the results for the other cases of $m$ because I only want to ask about the meaning of the notation. I come from a computer science background, and I don't understand the interpretation of little-o notation in probability: $\Pr[M > k_\alpha] = o(1)$.
I also quickly checked Wikipedia (https://en.wikipedia.org/wiki/Big_O_in_probability_notation), but it did not help: it talks about a sequence of random variables. Is "Balls into Bins" talking about such a sequence when it writes $M > k_\alpha$?
Thanks
Perhaps I can help shed some light.
Little-o notation is used to compare the behavior of two functions $f$, $g$ when inputs are arbitrarily large: it's merely a tool used in the study of asymptotics.
In a non-probabilistic setting what we have is $$ f(n) = o(g(n)) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)}=0 $$
which describes a situation where the function $f$ grows much much slower than the function $g$ as inputs approach infinity. An example: $x = o(x^2)$ since $\frac{x}{x^2} \to 0$ as $x \to \infty$.
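To make this concrete, here is a tiny numeric check of that example: the ratio $\frac{x}{x^2} = \frac{1}{x}$ shrinks toward $0$ as $x$ grows.

```python
# Illustrates x = o(x^2): the ratio x / x**2 equals 1/x and tends to 0.
ratios = [x / x**2 for x in (10, 100, 1000, 10000)]
print(ratios)  # -> [0.1, 0.01, 0.001, 0.0001]
```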
As to your specific case: unfortunately their notation is a bit sloppy. What is presumed in the context of the problem is that we let the number of bins $n \to \infty$.
Define $M_n$ as a random variable representing the maximum number of balls in any bin when we have exactly $n$ bins. In the case when $\alpha > 1$ and there are way more balls than bins, i.e. $m \gg n (\log n)^3$, what we have is:
$$P(M_n > k_{\alpha}) = o(1)$$ which means that $$\lim_{n\to \infty} P(M_n > k_{\alpha}) =0$$ so there is a vanishing chance that we find a bin with more than $k_{\alpha}$ balls in as the number of bins approaches infinity.
The case where $0 < \alpha < 1$ and $m \gg n (\log n)^3$ can be rewritten as $$1-P(M_n>k_{\alpha}) = P(M_n \leq k_{\alpha}) = o(1)$$
which means that we will almost certainly find a bin with more than $k_{\alpha}$ balls in it as the number of bins grows.
Your interpretation then depends on $\alpha$ and the definition of $k_{\alpha}$. Note that the first term $\frac{m}{n}$ in $k_{\alpha}$ is the expected number of balls in any given bin; the square-root term describes how far above that mean the maximum load sits, as a function of $\alpha$.
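If it helps, you can watch this happen empirically. The sketch below (my own illustration, not from the paper; the choices $n = 100$, $m = 64{,}000$ and the trial count are arbitrary, picked so that $m$ is roughly in the $m \gg n(\log n)^3$ regime) throws $m$ balls into $n$ bins, averages the observed maximum load $M_n$ over a few trials, and prints it next to $k_\alpha$:

```python
import math
import random

def k_alpha(m, n, alpha):
    # Threshold from the theorem for the heavy case m >> n (log n)^3;
    # log^{(2)} n is the iterated logarithm log(log n).
    log_n = math.log(n)
    return m / n + math.sqrt(
        (2 * m * log_n / n) * (1 - (1 / alpha) * math.log(log_n) / (2 * log_n))
    )

def avg_max_load(m, n, trials=20):
    # Empirical average of M_n: throw m balls uniformly into n bins,
    # record the fullest bin, repeat, and average.
    total = 0
    for _ in range(trials):
        counts = [0] * n
        for _ in range(m):
            counts[random.randrange(n)] += 1
        total += max(counts)
    return total / trials

n = 100            # number of bins (illustrative choice)
m = 640 * n        # number of balls, roughly in the m >> n (log n)^3 regime
print("mean load m/n:", m / n)
print("observed avg max load:", avg_max_load(m, n))
print("k_alpha for alpha=2:", k_alpha(m, n, alpha=2.0))
```

For these parameters the observed maximum load lands close to, and typically below, $k_\alpha$ for $\alpha > 1$, consistent with $P(M_n > k_\alpha) = o(1)$; the agreement is only approximate at finite $n$, since the theorem is an asymptotic statement.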