Proving that the runtime of an algorithm which checks for duplicates in an array is $\Theta(n^2)$

90 Views Asked by At

I have been given the following pseudocode:

1. dup(A):
2.   b = FALSE
3.   for i = 0 to n-1 do
4.      for j = i+1 to n-1 do
5.        if A[i] = A[j] then b = TRUE
6. return b

The question relating to this code is split up into two parts: firstly, prove that the algorithm is $O(n)$ and secondly prove that it is $\Theta(n)$. I have given answers in both cases but I'm unsure about their correctness. My proof of part 2 also doesn't involve $\Omega$ so I may have missed the point of the question somehow.

  1. Proof of $O(n)$:

Line $5$ is $\Theta(1)$, and is iterated from $j=i+1$ to $n-1$ by the loop in line $4$, therefore line $5$ is iterated $n-i-1$ times by line $4$.

$\sum\limits_{k=1}^{n-i-1}\Theta(1) = \Theta(\sum\limits_{k=1}^{n-i-1}1)$

Now $\sum\limits_{k=1}^{n-i-1}1 < \sum\limits_{k=1}^{n}1 = n$ so lines $4$-$5$ are $O(n)$. Lines $4$-$5$ are iterated $n$ times by line $3$ so overall the algorithm is $O(n^2)$.

  1. Proof of $\Theta(n^2)$:

For each $i$th iteration of the loop in line $3$, there are $n-i-1$ iterations of line $5$ (from $i = 0$ to $n-1$). So when the loop in line $3$ initiates and $i=0$, there are $n-1$ iterations of line $5$. Similarly when $i=1$ there are $n-2$ iterations of line $5$ and so on. So there are $(n-1) + (n-2) + \dots + (n-(n+1))$ iterations of line $5$ in total. This can be expressed as

$\sum\limits_{k=1}^{n-1}k\Theta(1) = \Theta(\sum\limits_{k=1}^{n-1}k)$ = $\Theta(\frac{n(n-1)}{2}) = \Theta(n^2)$

My proof of part $2$ seems very general and not technical, presuming that it is even correct. Would it be better to provide an answer which proves that the algorithm is also $\Omega(n)$ rather than try to prove the $\Theta(n)$ case generally?

2

There are 2 best solutions below

0
On

First part it unnecessary. If you prove this algorithm runs $\Theta(n^2)$ than you have $O(n^2)$.

0
On

Your proof seems fine to me, and as Konrad noted, part 2 implies part 1. No matter what values are in the array, the algorithm executes the $\Theta(1)$ statement if A[i] = A[j] then b = TRUE the same number of times: once for every pair of integers $(i,j)$ where $0\le i < j < n$. As you found out, there are exactly $\frac{n(n-1)}{2}$ such pairs. What’s outside the loop is $\Theta(1)$, and if you want to look at everything (well, except for messy stuff like virtual memory and caching in case the array is large), the loop overhead to manage iterating values of $i$ and $j$ is also $\Theta(n^2)$, so you’re fine.