Question regarding Monoalphabetic Phi Test

37 Views Asked by At

I've been asked to prove the following system of inequalities;

$$1 \ge \phi(T) \ge \frac{n-k}{k(n-1)}$$

Where $\phi(T) = \sum_{i=1}^{k} \frac{n_i (n_i -1)}{n(n-1)}$, $T =$ some text, $n = $ length of text $T$, $k = $ size of the alphabet, and $n_i =$ the frequency of a letter from the alphabet within the text.

Finding the equalities was relatively easy; they exist when the size of the alphabet, $k$, is equal to one.

Where I'm having trouble is proving the inequalities. I can pretty much see that, due to $\phi(T)$ being a sum of frequencies, it'll always be less than one, but should I be proving this with induction?? And further, I can't really see how to prove that $\phi(T) \gt \frac{n-k}{k(n-1)}$. Again, I can see how it works, and I've done it with a few examples, but I'm not too sure how to translate that into a rigorous proof.

1

There are 1 best solutions below

0
On BEST ANSWER

This follows immediately from the convexity of $x(x-1)$.

The minimum is achieved when all variables are equal, ie $n_i = \frac n k $. The maximum is achieved at the end points, aka one value equals $n$ and the rest are 0