Let $n_i$ be the random variable denoting the number of elements placed in bucket $B[i]$ and let $X_{ij}$ be the indicator variable $\mathbb{I}\{A[j] \text{ falls in bucket } i\}$.
Thus, for each $i$, defining $n_{i} := \sum^{n}_{j=1}{X_{ij}}$, we have
$$\mathbb{E}[n^2_i] = \mathbb{E}\left[\left(\sum^{n}_{j=1}{X_{ij}}\right)^2\right].$$
Then I got confused by the two equations below. I don't know how they had this
- $\mathbb{E}[n^2_i] = \mathbb{E}\left[\left(\sum^{n}_{j=1}{X_{ij}}\right) \left(\sum^{n}_{k=1}{X_{ik}}\right)\right]$
- $\mathbb{E}[n^2_i] = \mathbb{E}\left[\sum^{n}_{j=1}X_{ij}^2 + \sum_{1\leq j\leq n}{X_{ij}}\sum_{1\leq k\leq n,j \neq k}{X_{ik}}\right]$
Please can someone explain to me how they manage to get the two equations above
The probability backdrop of this is actually irrelevant, as this is purely about symbolic algebra. First, $$ \left(\sum^{n}_{j=1}{X_{ij}}\right)^2=\left(\sum^{n}_{j=1}{X_{ij}}\right)\left(\sum^{n}_{j=1}{X_{ij}}\right)=\left(\sum^{n}_{j=1}{X_{ij}}\right)\left(\sum^{n}_{k=1}{X_{ik}}\right) $$ where the 2nd equality above is just changing the dummy counter from "$j$" to "$k$" for the second summation. Second, as for why: $$ \left(\sum^{n}_{j=1}{X_{ij}}\right)\left(\sum^{n}_{k=1}{X_{ik}}\right)=\sum^{n}_{j=1}X_{ij}^2 + \sum_{1\leq j\leq n}{X_{ij}}\sum_{1\leq k\leq n,j \neq k}{X_{ik}} $$ you just have multiply out the LHS: $$ \left(\sum^{n}_{j=1}{X_{ij}}\right)\left(\sum^{n}_{k=1}{X_{ik}}\right)=\sum^{n}_{j=1}\left[{X_{ij}}\left(\sum^{n}_{k=1}{X_{ik}}\right)\right]=\sum^{n}_{j=1}\left[X_{ij}^2+X_{ij}\left(\sum_{1\leq k\leq n, k\neq j}{X_{ik}}\right)\right]. $$