I would to discuss a proof found in CLRS book please.
Theorem: In a hash table in which collisions are resolved by chaining, a successful search takes time $\Theta(1 + \alpha)$, on the average, under the assumption of simple uniform hashing.
Proof: Assume we $n$ elements are equally likely to be searched in the table. The number of elements examined during a successful search for an element $x$ is 1 more than the number of elements that appear before $x$ in $x$’s list. New element will be added at the end of the list. So, in each slot, we would have a list in case of collision and this is how chaining resolves collision.
To find the expected number of elements examined, we take the average, over the n elements $x$ in the table, of 1 plus the expected number of elements added to $x$’s list after x was added to the list. Let $x_i$ denote the ith element inserted into the table, for $i = 1, 2, \cdots, n$ and let $k_i = key[x_i]$. For keys $k_i$ and $k_j$, we define the indicator random variable $X_{ij} = I \{h(k_i) = h(k_j)\}$. Under the assumption of simple uniform hashing, we have $Pr\{h(k_i) = h(k_j)\} = 1/m$, $E [X_{ij}] = 1/m$. Thus, the expected number of elements examined in a successful search is
$$ \begin{gather} E\left[ \frac{1}{n} \sum_{i=1}^{n} \left( 1 + \sum_{i=1}^{n} X_{ij} \right)\right] \tag{1} \label{1}\\ =\frac{1}{n} \times \left[\sum_{i=1}^{n} \left( 1 + \sum_{i=1}^{n} E\left[X_{ij}\right] \right)\right] &\text{(Based on Linearlity of Expectation)}\\ \vdots \\ =\Theta(1+\alpha) \end{gather} $$
Problems:
- The number of elements examined during a successful search for an element $x$ is 1 more than the number of elements that appear before $x$ in $x$’s list. So, if our table is empty and we have only 1 element, then the time needed would be $O(1)$. If all elements are hashed to first entry of the table, then we would need $O(Elements)$ to find the item as I understand. Why the time needed is $\Theta(1+\alpha)$ please?
- How $\frac{1}{n} \sum_{i=1}^{n} \left( 1 + \sum_{i=1}^{n} X_{ij} \right)$ in Equation \ref{1} was brought in first place please? From what I understood, since all elements to be found are equally likely, then we would have $\frac{1}{n}$, so probably this is why its there in \ref{1}.