There is an analysis of a maximum-finding algorithm in Knuth's *The Art of Computer Programming*, Vol. 1. There he analyzes the following pseudocode:
#include <stdio.h>

#define n 9 // n can be large; assume n = 9 here

int main() {
    // arr[n] holds n random values; for the average-case
    // analysis, assume the values are distinct
    int arr[n] = {2,3,-4,1,0,6,2,3,4};
    int j, m;
    // initialize
    j = n-1;        // assign last index to j
    m = arr[n-1];   // assign last value to m
    int k = n-2;    // assign 2nd last index to k
    // loop
    while (k >= 0) {        // stop once k passes the first element
        if (arr[k] <= m) {  // m is the current maximum value
            k--;            // arr[k] is not larger than the current
            continue;       // maximum, so just move to the previous element
        } else {
            j = k;          // found new maximum index
            m = arr[k];     // found new maximum value
            k--;            // move to the previous element
        }
    }
    printf("%d %d\n", m, j);
    return 0;
}
The question of interest was: on average, how many times does the else block execute?
He derives the answer using generating functions, and the result is
$$H_n - 1,$$
where $H_n = \sum_{k=1}^{n} \frac{1}{k}$ is the $n$-th harmonic number.
I convinced myself using the following expected-value argument.
Let $E(n)$ be the expected number of times the else block executes. Consider the first element of the array (the last one the scan examines):
- If the first element is not the overall maximum, the count is the same as for the remaining $n-1$ elements: $E(n) = E(n-1)$.
- If the first element is the overall maximum, it contributes one extra execution: $E(n) = E(n-1) + 1$.

The first element is the maximum with probability $\frac{1}{n}$, and is not the maximum with probability $\frac{n-1}{n}$.
So,
$$\begin{align*} E(n) &= \left [ E(n-1) +1 \right ]\left ( \frac{1}{n} \right ) + \left [ E(n-1) \right ]\left ( \frac{n-1}{n} \right ) \\ &= E(n-1) + \frac{1}{n} \end{align*}$$
Solving this recurrence (with $E(1) = 0$) gives $E(n) = H_n - 1 = O(\log n)$ for large $n$.
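Unrolling the recurrence term by term makes the harmonic sum explicit:

$$\begin{align*} E(n) &= E(n-1) + \frac{1}{n} \\ &= E(n-2) + \frac{1}{n-1} + \frac{1}{n} \\ &\;\;\vdots \\ &= E(1) + \sum_{k=2}^{n} \frac{1}{k} \\ &= H_n - 1, \end{align*}$$

since $E(1) = 0$ (a one-element array never executes the else block) and $H_n = \sum_{k=1}^{n} \frac{1}{k}$.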
Mathematically it is fine to get $\log n$ as the answer, but what does it actually mean?
If the sequence is increasing, the else block executes zero times, and for a decreasing sequence it executes $n-1$ times. So the average number of times the else block runs would seem to be about $\frac{n}{2}$. We know this is wrong, so apparently not all counts are equally likely — but I couldn't figure out why the average is logarithmic.
First, I really think it would be more appropriate to post this on cs.stackexchange.com.
The question is: over all possible orderings of the random numbers in `arr`, what is the average number of times the algorithm goes inside the `else` block?

There are cases where the algorithm never enters the `else` block (for example, when `arr` is sorted in increasing order, so the last element is already the maximum). There are also cases where it enters the `else` block $n-1$ times (for instance, when `arr` is sorted in decreasing order, so every step finds a new maximum). Between these extremes there are cases where the algorithm enters that block $1, 2, \cdots, n-2$ times. But the question is which counts are more likely, and the answer is found using combinatorics: averaged over all possible orderings, the algorithm enters the `else` block about $\log n$ times, because the probability that the algorithm visits `else` only a few times is much higher than the probability of entering it a large number of times.

Note that entering the `else` block only once is much more likely than entering it $n-1$ times, and similarly for 2, and so on. As a very simple illustration, consider the extreme cases we both discussed: the number of orderings for which the algorithm never enters the `else` block is $(n-1)!$ (any ordering whose last element is the maximum), while there is exactly one ordering for which it enters the block $n-1$ times.
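To make the skew concrete, here is a small worked example (not from the original discussion): take $n = 3$ with distinct values $\{1,2,3\}$, scan each permutation from the back, and count how often a new maximum is found:

$$\begin{array}{c|c} \text{permutation} & \text{else executions} \\ \hline (1,2,3) & 0 \\ (2,1,3) & 0 \\ (1,3,2) & 1 \\ (2,3,1) & 1 \\ (3,1,2) & 1 \\ (3,2,1) & 2 \end{array}$$

The average is $\frac{0+0+1+1+1+2}{6} = \frac{5}{6} = H_3 - 1$, and already here $(3-1)! = 2$ permutations give zero executions while only one permutation gives the maximum count of $n-1 = 2$.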