Maximum and average number of inversions in array by induction

13.4k Views Asked by At

Just for your information, an inversion in an array $a$ is any ordered pair of points $(i, j)$ where $i < j$ and $a_i > a_j$.

I can prove the maximum and average number of inversions in an array easily, like so...

The max will be when the array is fully unsorted in which case the number of inversions would be $(n-1) + (n-2) + (n-3)$ all the way down to $(n-(n-1))$. This gets simplified to 0.5(n-1)n maximum inversions. Now the average number of inversions will just be half way between the best and worst case scenario. The worst case scenario is $0.5(n-1)n$ and the best case scenario is $0$ so the average would be $0.25(n-1)n$

Now my question is: how would you go about proving this via induction instead? I'm having trouble knowing what the base case will be and I'm just having trouble with it in general.

1

There are 1 best solutions below

0
On BEST ANSWER

Since you insist on proof by induction, maybe you could try it like this:

Maximum number of inversions

a) Prove by induction that the number of inversions is $\le \frac{n(n-1)}2$.

$1^\circ$ If $n=1$, we have only one permutation which has no inversions.

$2^\circ$ Suppose the claim is true for $n$ elements. If we add $(n+1)$-th element, then we have inversions coming from the first $n$ elements -- there is at most $\frac{n(n-1)}2$ of them, and we also have inversions containing the $(n+1)$-th element -- there is at most $n$ of them. $$\frac{n(n-1)}2 + n = \frac{n(n+1)}2$$ So we obtained the same formula for $(n+1)$ instead if $n$.

b) Prove by induction that the permutation $(n,n-1,\dots,2,1)$ has exactly $\frac{n(n-1)}2$ inversions.
(Note that I am using one-line notation, not cycle notation for permutations.)

$1^\circ$ The permutation $(1)$ has no inversions.

$2^\circ$ The permutation $(n+1,n,\dots,2,1)$ has $n$ inversions containing the element $(n+1)$. The remaining inversion are the same as in the permutation $(n,\dots,2,1)$. So together we have $$\frac{n(n-1)}2 + n = \frac{n(n+1)}2$$ inversions.


Average number of inversions

We claim that average number of inversions is $\frac{n(n-1)}4$.

$1^\circ$ For $n=1$ we only have one permutation, which has no inversions.

$2^\circ$ Let $S_n$ denotes the set of all permutations of $\{1,2,\dots,n\}$ and $i(\varphi)$ denotes the number of inversions of permutation $\varphi$.

We are interested in the average $$A_n = \frac{\sum_{\varphi\in S_n}i(\varphi)}{n!}.$$ We assume that $A_n=\frac{n(n-1)}4$ and we want to calculate $A_{n+1}$.

For a permutation $\varphi$ let us denote $\overline\varphi$ the permutation, in which the order of elements is reversed.

Notice that if $(n+1)$ is in $k$ inversions in $\varphi\in S_{n+1}$, then it is in $(n-k)$ inversions in $\overline{\varphi}$. This yields $$i(\varphi)+i(\overline\varphi)=k+i(\varphi')+(n-k)+i(\overline{\varphi'}) = n + i(\varphi')+i(\overline{\varphi'}),$$ where $\varphi'$ denotes the permutation obtained by omitting the element $(n+1)$. (Note that $k$ depends on $\varphi$. But in the above expression we were working only with one permutation, so we did not denote this by $k_\varphi$ or some other notation stressing this dependence.)

So we get $$ \begin{align} 2(n+1)!A_{n+1} = &\sum_{\varphi\in S_{n+1}}(i(\varphi)+i(\overline{\varphi})) =\\ =&\sum_{\varphi\in S_{n+1}} (n+i(\varphi')+i(\overline{\varphi'})) =\\ =&(n+1)!n + \sum_{\varphi\in S_{n+1}} (i(\varphi')+i(\overline{\varphi'})) \overset{(*)}= \\ =&(n+1)!n + (n+1)\left(\sum_{\varphi'\in S_{n}} i(\varphi')+ \sum_{\varphi'\in S_{n}} i(\overline{\varphi'})\right)=\\ =&(n+1)!n +(n+1) \left(\sum_{\varphi'\in S_{n}} i(\varphi')+ \sum_{\varphi'\in S_{n}} i(\varphi')\right)=\\ =&(n+1)!n + 2(n+1) \cdot n! \cdot A_n =\\ =&(n+1)!n + 2 (n+1)! \cdot A_n \end{align}$$ (In the equation denoted by $(*)$ we have the factor $(n+1)$ since there are $(n+1)$ possibilities how to obtain the same permutation $\varphi'$ from a permutation of $(n+1)$ elements.)

The last equation yields $$A_{n+1} = \frac n2 + A_n = \frac n2 +\frac{n(n-1)}4 = \frac{n(n+1)}2$$


However, if you have a look at other proofs about average number of inversions, the trick we used in this proof (using pairs of permutation and this reverse) can probably relatively easily transformed to a proof avoiding induction completely.


Let me comment on one thing you wrote in your post:

Now the average number of inversions will just be half way between the best and worst case scenario.

It is not necessarily true that average number over the whole sample is always simply arithmetic mean of the worst case and the best case. Just have a look at examples of best/average/worst time complexity on Wikipedia.