Using inequality $\sum_{k-2}^{n-1} k \log k \leq \frac{1}{2} n^2 \log n - \frac{1}{8} n^2$ to show the recurrence of quicksort runtime

94 Views Asked by At

Use the inequality $$\sum_{k=2}^{n-1} k \log k \leq \frac{1}{2} n^2 \log n - \frac{1}{8} n^2$$ To show that $$C_n = \frac{2}{n}\sum_{i=2}^{n-1}C_i + \Theta(n)$$ is $\Theta(n \log n)$

This is an exercise for analyzing the runtime of randomized quicksort. I understand other ways to prove that randomized quick sort has an average case of $\Theta(n \log n)$ but I don't know how to prove it by the recurrence and inequality above. I have tried substitution like the following:

\begin{align*} C_n &= \frac{2}{n}\sum_{i=2}^{n-1} \left(\frac{2}{i}\sum_{j=2}^{i-1} C_j + \Theta(i) \right) + \Theta(n) \end{align*}

and I feel like repeated applying this substitution should be the way to proceed but it would require a variable amount of summation symbol which I am not sure how to express.

Also, I have tried subtracting $C_{n-1}$ from $C_{n}$. It could be a way to solve the question by assuming $\Theta(n) = n$ (which is true in quick sort) but it doesn't seem like i can use the inequality above in this method.

1

There are 1 best solutions below

1
On BEST ANSWER

This is essentially asking you to prove it by substitution: you want to prove that $C_n = O(n\log n)$ is a solution to the recurrence. You'll do that by induction.

Your induction hypothesis (IH) is $C_k \leq c\cdot k\log k$, for some absolute constant $c>0$ to be determined. The $\Theta(n)$ in the recurrence relation for $C_n$ should be replaced by $c' n$ for some constant (which you don't have control over).

  • Base case: $C_2 \leq c\cdot 2\log 2$ as long as the constant $c>0$ is chosen greater than $C_2/(2\log 2)$. [good]
  • Induction step: this is where you'll be using that fact. Assume the IH holds for all $2\leq k< n$. Then $$\begin{align*} C_n &\leq \frac{2}{n}\sum_{k=2}^{n-1} C_k + c' n \tag{recurrence relation}\\ &\leq \frac{2 c}{n}\sum_{k=2}^{n-1} k\log k + c' n \tag{IH} \\ &\leq c n\log n - \frac{c}{4}n + c' n \tag{using the fact}\\ &\leq c n\log n \end{align*}$$ the last inequality as long as we have $c\geq 4c'$ (so that the added term is non-negative).

By induction,this shows that as long as we choose $c \geq \max(C_2/(2\log 2, 4c')$, $\boxed{C_n \leq c n\log n}$.