$O(n)$ algorithm to determine a number that appears more than $n/2$ times in an array of size $n$

1.8k Views Asked by At

I am trying to determine $O(n)$ algorithm to determine a number that appears more than $n/2$ times in an array of size $n$, assuming that such a number does exist in the array without resorting to an algorithm that computes the median.

This problem came with a hint stating that

Property 1: if an array of size $n$, $A := [1,...,n]$ has an element that appears more than $n/2$ times and $A[1] \neq A[2]$ then x will have the same property of appearing more than $ \frac{n-2}{2} $ in the sub-array $B:=[3,...,n] $. This property is to be used in determining the desired algorithm.

I proved this as follows: Suppose that $x$ does not appear in the first two elements of A, then it must appear $n/2$ times in the remaining $n-2$ elements, thus $x$ will appear more than $ \frac{n-2}{2} $ times. Now suppose that $x=A[1]$, (the case of $x=A[2]$ being similar), then $x$ must appear more than $\frac {n}{2} - 1 = \frac {n-2}{2}$ times. So either way Property $1$ is satisfied.

Thoughts:This technique could be used to continually shorten the number of elements in the array until there is only one element, but it works only if $A[i] \neq A[i+1]$. Here is where I am stuck. Any hints on how to proceed much appreciated.

1

There are 1 best solutions below

3
On BEST ANSWER

The hint is a special case of the following observation: Given an array $A = a_1a_2\cdots a_n$ with an element $a^*$ occurring more than $n / 2$ times, if we remove two elements $a_i$ and $a_j$ with $a_i \neq a_j$ from $A$, $a^*$ is still a majority element among the remaining elements.

Based on this observation, an algorithm with $\mathcal{O}(n)$ time and $\mathcal{O}(1)$ space exists. Below is the sketch.

  1. $count \leftarrow 1$;

  2. $candidate \leftarrow a_1$;

  3. for $i \leftarrow 2$ to $n$:

  4. $\quad$ if $count = 0$:

  5. $\quad$ $\quad$ $candidate \leftarrow a_i$;

  6. $\quad$ $\quad$ $count \leftarrow 1$;

  7. $\quad$ else if $a_i \neq candidate$:

  8. $\quad$ $\quad$ $count \leftarrow count - 1$;

  9. $\quad$ else:

  10. $\quad$ $\quad$ $count \leftarrow count + 1$;

  11. return $candidate$;

The algorithm maintains after each iteration, $count$ elements (i.e., $candidate$) are unmatched. Therefore, when algorithm terminates, $candidate$ is the majority element.