I am reading a paper on spherical locality-sensitive hashing, where the authors claim the following:
Suppose we sample $U=2^{\Theta(\sqrt{n})}$ vectors $s_1, \dots, s_U \in \mathbb{R}^{n}$ from an $n$-dimensional Gaussian distribution with average norm $\mathbb{E}[\|s_i\|]=1$. Equivalently, each vector entry is drawn independently from a univariate Gaussian distribution $\mathcal{N}(0, 1/n)$. To each $s_i$ we associate a hash region $$ H_i=\left\{v \in S^{n-1}:\langle v, s_i \rangle \geq n^{-1 / 4}\right\} \setminus \bigcup_{j=1}^{i-1} H_j $$ Since $v\in S^{n-1}$ and w.h.p. $\|s_i\| \approx 1$, the condition $\langle v, s_i \rangle \geq n^{-1 / 4}$ is equivalent to $\|v-s_i\| \leq \sqrt{2}-\Theta(n^{-1/4})$, i.e., $v$ lies in the almost-hemisphere of radius $\sqrt{2}-\Theta(n^{-1/4})$ defined by $s_i$. Note that the parts of $S^{n-1}$ covered by multiple hash regions are assigned to the first region $H_i$ that covers the point. As a result, the size of the hash regions generally decreases with $i$. Also note that the choice of $U=2^{\Theta(\sqrt{n})}$ guarantees that, with high probability, the entire sphere ends up covered by the hash regions $H_1, H_2,\dots, H_U$; **informally, each hash region covers a $2^{-\Theta(\sqrt{n})}$ fraction of the sphere, so we need $2^{\Theta(\sqrt{n})}$ regions to cover the entire hypersphere.**
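To make the setup concrete, here is a small Monte Carlo sketch I wrote (my own code, not from the paper). It samples vectors with entries $\mathcal{N}(0, 1/n)$ to check that $\|s_i\| \approx 1$, and estimates $\Pr[\langle v, s_i \rangle \geq n^{-1/4}]$ for a fixed unit $v$; by rotational symmetry that inner product is distributed as $\mathcal{N}(0, 1/n)$, so a Gaussian tail bound gives a cap fraction of roughly $e^{-\sqrt{n}/2} = 2^{-\Theta(\sqrt{n})}$:

```python
import math
import random

def sample_s(n, rng):
    """One LSH vector: each entry ~ N(0, 1/n), so E[||s||^2] = 1."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

def mean_norm(n, samples, rng):
    """Average norm of `samples` random vectors; should concentrate near 1."""
    total = 0.0
    for _ in range(samples):
        s = sample_s(n, rng)
        total += math.sqrt(sum(x * x for x in s))
    return total / samples

def cap_fraction(n, trials, rng):
    """Estimate P(<v, s> >= n^{-1/4}) for a fixed unit vector v.
    By rotational symmetry <v, s> ~ N(0, 1/n), so one Gaussian sample
    per trial suffices; the true value decays like exp(-sqrt(n)/2)."""
    thresh = n ** -0.25
    sigma = 1.0 / math.sqrt(n)
    hits = sum(1 for _ in range(trials) if rng.gauss(0.0, sigma) >= thresh)
    return hits / trials

rng = random.Random(0)
print(mean_norm(100, 200, rng))        # close to 1
print(cap_fraction(16, 40_000, rng))   # roughly exp(-2) tail, ~0.02
print(cap_fraction(64, 40_000, rng))   # much smaller, ~0.002
```

Running this, the cap fraction visibly shrinks as $n$ grows, matching the $2^{-\Theta(\sqrt{n})}$ claim.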
I can understand everything except for the last sentence, which I have made bold. I can see why the $H_i$'s are decreasing in size, but I can't reason about the claim that each region covers a $2^{-\Theta(\sqrt{n})}$ fraction of the sphere. Perhaps it has something to do with how each $s_i$ is sampled, but I am not sure.
Assuming (for simplicity) that $\|s_i\| = 1$, we can think of each $H_i$ as a patch covering some fixed area of the hypersphere, "centered" at $s_i$. So maybe another formulation is: what is the expected number of samples $s_i$ needed to cover the whole hypersphere with those patches? For example, for $n=2$ this amounts to covering the unit circle with random arcs.
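For the $n=2$ case I tried simulating this directly (my own sketch, not from the paper): the condition $\langle v, s_i \rangle \geq 2^{-1/4}$ with $\|v\|=\|s_i\|=1$ means $v$ lies within angle $\theta = \arccos(2^{-1/4})$ of $s_i$, so each $H_i$ is an arc of half-angle $\theta$ around a uniformly random center, and the circle is covered exactly when every gap between consecutive sorted centers is at most $2\theta$:

```python
import math
import random

def arcs_to_cover(theta, rng):
    """Drop arcs of half-angle theta at uniformly random centers on the
    unit circle until they cover it; return how many arcs were needed."""
    centers = []
    while True:
        centers.append(rng.uniform(0.0, 2.0 * math.pi))
        centers.sort()
        # Covered iff every circular gap between consecutive centers
        # is at most the full arc width 2*theta.
        gaps = [b - a for a, b in zip(centers, centers[1:])]
        gaps.append(centers[0] + 2.0 * math.pi - centers[-1])
        if max(gaps) <= 2.0 * theta:
            return len(centers)

rng = random.Random(1)
theta = math.acos(2 ** -0.25)  # n = 2 threshold, ~0.57 rad per half-arc
counts = [arcs_to_cover(theta, rng) for _ in range(200)]
print(sum(counts) / len(counts))  # average number of arcs to cover the circle
```

Note that at least $\lceil \pi/\theta \rceil = 6$ arcs are always needed here (total arc length must reach $2\pi$), but the random placement typically requires a few times more than that.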
Any help is highly appreciated!