Distribution of $X_{(n)}$ - Two different answers in two different approaches

Let $X_1,\ldots,X_n$ be iid discrete uniform on $\{1,\ldots,N\}$. We want to compute the distribution of $$X_{(n)}:=\max{\{X_1,\ldots,X_n\}}$$

Approach 1: Direct mass function calculation

$$P\{X_{(n)}=y\}=n \cdot \frac{1}{N} \cdot \left(\frac{y}{N}\right)^{n-1}$$

Approach 2: Via the distribution function

$$P\{X_{(n)}=y\}=P\{X_{(n)} \leq y\} - P\{X_{(n)} \leq y-1\}=\left(\frac{y}{N}\right)^n - \left(\frac{y-1}{N}\right)^n$$

I'm getting two different answers from the two approaches! I know that the second answer is correct, but what is wrong with approach 1?

Best answer

In approach 1 you assume that one of $X_1,\ldots, X_n$ takes the value $y$ and the remaining values are less than or equal to $y$. But the events
$$ A_1=\{X_1=y, X_2\leq y,\ldots,X_n\leq y\}, $$
$$ A_2=\{X_2=y, X_1\leq y, X_3\leq y,\ldots,X_n\leq y\}, $$
$$\ldots$$
$$ A_n=\{X_n=y, X_1\leq y, \ldots,X_{n-1}\leq y\} $$
are not mutually disjoint. Indeed, the event $\{X_1=\ldots=X_n=y\}$ is a subset of every one of them, so any outcome with more than one value equal to $y$ is counted several times. Therefore
$$ \mathbb P\{X_{(n)}=y\}=\mathbb P\{A_1 \cup\ldots\cup A_n\}\color{red}{\neq} \mathbb P\{A_1\}+\ldots+\mathbb P\{A_n\}=\underbrace{\frac{1}{N}\left(\frac{y}{N}\right)^{n-1}+\ldots+\frac{1}{N}\left(\frac{y}{N}\right)^{n-1}}_n. $$

If you want to use approach 1, you can either apply the inclusion-exclusion formula or separate the values that are equal to $y$ from the values that are strictly less than $y$. Say, let
$$ B_k=\{X_{(n-k)}\leq y-1, X_{(n-k+1)}=\ldots=X_{(n)}=y\} $$
(for $k=n$ the first condition is vacuous). This event means that exactly $k$ of the values $X_1,\ldots,X_n$ are equal to $y$, and the remaining $n-k$ values are strictly less than $y$. The events $B_1,\ldots,B_n$ are mutually disjoint, and there are $\binom{n}{k}$ ways to choose which $k$ variables equal $y$, so
$$ \mathbb P\{X_{(n)}=y\}=\mathbb P\{B_1 \cup\ldots\cup B_n\} = \sum_{k=1}^n \mathbb P\{B_k\} = \sum_{k=1}^n \binom{n}{k} \left(\frac{1}{N}\right)^k \left(\frac{y-1}{N}\right)^{n-k}. $$

Note that this answer coincides with the answer in approach 2: by the binomial theorem,
$$ \sum_{k=1}^n \binom{n}{k} \left(\frac{1}{N}\right)^k \left(\frac{y-1}{N}\right)^{n-k} = \left(\frac{1}{N}+\frac{y-1}{N}\right)^n - \left(\frac{y-1}{N}\right)^n = \left(\frac{y}{N}\right)^n - \left(\frac{y-1}{N}\right)^n. $$
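For anyone who wants to see the overcounting numerically, here is a minimal sketch in Python that compares the formulas against brute-force enumeration using exact fractions. The function names and the small test values $n=3$, $N=4$ are illustrative assumptions, not part of the question.

```python
from fractions import Fraction
from itertools import product
from math import comb


def pmf_bruteforce(n, N, y):
    """P(max = y) by enumerating all N**n equally likely outcomes."""
    hits = sum(1 for xs in product(range(1, N + 1), repeat=n) if max(xs) == y)
    return Fraction(hits, N ** n)


def pmf_approach1(n, N, y):
    """Naive formula n * (1/N) * (y/N)**(n-1); it counts ties at y more than once."""
    return n * Fraction(1, N) * Fraction(y, N) ** (n - 1)


def pmf_approach2(n, N, y):
    """Difference of distribution functions: (y/N)**n - ((y-1)/N)**n."""
    return Fraction(y, N) ** n - Fraction(y - 1, N) ** n


def pmf_binomial_sum(n, N, y):
    """Sum over k = 1..n of P(exactly k values equal y, the rest below y)."""
    return sum(
        comb(n, k) * Fraction(1, N) ** k * Fraction(y - 1, N) ** (n - k)
        for k in range(1, n + 1)
    )


# Illustrative values only (assumed for this sketch).
n, N = 3, 4
for y in range(1, N + 1):
    print(
        y,
        pmf_bruteforce(n, N, y),
        pmf_approach1(n, N, y),
        pmf_approach2(n, N, y),
        pmf_binomial_sum(n, N, y),
    )
```

With these values the brute-force column should match approach 2 and the binomial sum for every $y$, while the approach 1 column is larger and sums to more than 1 over $y$, so it cannot be a probability mass function.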