Probability that kth order statistic is s given one of the values is s

Let $U_1, \cdots, U_n \overset{\text{iid}}{\sim} \textrm{Unif}[0, 1]$ and $0 < s < 1$. Also, let us denote their order statistics as $U_{(1)},\cdots, U_{(n)}$.

I would like to compute the following conditional probability: $$ \textrm{Pr}\left[U_{(k)} = s | s \in \{ U_1, \cdots, U_n \}\right] $$

My attempt has been to come up with a recursive formula for this probability, by considering in which region $U_n$ lies, that is:

$$ \begin{align} \textrm{Pr}\left[U_{(k)} = s | s \in \{ U_1, \cdots, U_n \}\right] &=\textrm{Pr}[U_{(k)} = s, U_n > s | s \in \{ U_1, \cdots, U_{n-1} \}] \\ &\,+ \textrm{Pr}[U_{(k)} = s, U_n = s | s \in \{ U_1, \cdots, U_{n} \}] \\ &\,+ \textrm{Pr}[U_{(k)} = s, U_n < s | s \in \{ U_1, \cdots, U_{n-1} \}] \end{align} $$

However, I could not derive meaningful results from this, mostly because I have never worked with conditional probabilities where the condition has 'zero probability', as in this case where a single value is given to have appeared in the set of continuous random variables.

Thank you.

Best answer

Let us write $\mathcal{U} = \{ U_1, \cdots, U_n \}$ for simplicity. I will assume that $\mathbf{P}[ \,\cdot\, \mid s \in \mathcal{U}]$ is defined via the weak limit

\begin{align*} \mathbf{P}[ \cdot \mid s \in \mathcal{U} ] &= \lim_{\varepsilon \to 0^+} \mathbf{P}[ \cdot \mid \mathcal{U} \cap \mathcal{B}_{\varepsilon}(s) \neq \varnothing ], \end{align*}

where $\mathcal{B}_{\varepsilon}(s) = (s-\varepsilon, s+\varepsilon)$ is the open interval of radius $\varepsilon$ about $s$.

Remark. As @Andrew Zhang mentioned in the comments, this is by no means a canonical way of defining the conditional distribution (and we do not expect such a thing to exist). Whether this particular definition of $\mathbf{P}[ \,\cdot\, \mid s \in \mathcal{U}]$ makes sense or is useful hinges on how it will be used in OP's context.
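The limiting definition can be probed numerically. The following is only a minimal sketch, not part of the argument: it fixes a small $\varepsilon$, keeps the samples in which some $U_i$ lands in $\mathcal{B}_{\varepsilon}(s)$, and uses the event $U_{(k)} \in \mathcal{B}_{\varepsilon}(s)$ as a finite-$\varepsilon$ proxy for $\{U_{(k)} = s\}$. The function name `estimate_atom` and its parameters are illustrative assumptions.

```python
import random


def estimate_atom(n, k, s, eps, trials, seed=0):
    """Estimate P[U_(k) in B_eps(s) | some U_i in B_eps(s)] by rejection sampling.

    This is a finite-eps stand-in for the limiting conditional probability;
    the estimate carries both Monte Carlo noise and an O(eps) bias.
    """
    rng = random.Random(seed)
    hits = 0      # samples with at least one U_i inside B_eps(s)
    kth_hits = 0  # among those, samples where U_(k) itself is inside B_eps(s)
    for _ in range(trials):
        u = sorted(rng.random() for _ in range(n))
        if any(abs(x - s) < eps for x in u):
            hits += 1
            if abs(u[k - 1] - s) < eps:
                kth_hits += 1
    # If no sample was accepted, the conditional frequency is undefined.
    return kth_hits / hits if hits else float("nan")


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    print(estimate_atom(n=4, k=2, s=0.3, eps=0.01, trials=50_000))
```

Shrinking `eps` reduces the bias but also the acceptance rate (roughly $2n\varepsilon$), so `trials` has to grow accordingly.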

In order to compute the law of $U_{(k)}$ under $\mathbf{P}[ \,\cdot\, \mid s \in \mathcal{U}]$, let us consider an arbitrary smooth test function $h \in \mathcal{C}^{\infty}([0, 1])$. Then we get

\begin{align*} \mathbf{E}[ h(U_{(k)}) \mid \mathcal{U} \cap \mathcal{B}_{\varepsilon}(s) \neq \varnothing ] &= \frac{\mathbf{E}[ h(U_{(k)}) \mathbf{1}_{\{\mathcal{U} \cap \mathcal{B}_{\varepsilon}(s) \neq \varnothing\}} ]}{\mathbf{P}[ \mathcal{U} \cap \mathcal{B}_{\varepsilon}(s) \neq \varnothing ]} \\ &= \frac{\sum_{j=1}^{n} \mathbf{E}[ h(U_{(k)}) \mid U_{(j)} \in \mathcal{B}_{\varepsilon}(s) ] \mathbf{P}[U_{(j)} \in \mathcal{B}_{\varepsilon}(s)] + \mathcal{O}(\varepsilon^2)}{\sum_{j=1}^{n} \mathbf{P}[U_{(j)} \in \mathcal{B}_{\varepsilon}(s)] + \mathcal{O}(\varepsilon^2)} \\ &= \frac{\sum_{j=1}^{n} \mathbf{E}[ h(U_{(k)}) \mid U_{(j)} = s ] f_{U_{(j)}}(s) + \mathcal{O}(\varepsilon)}{\sum_{j=1}^{n} f_{U_{(j)}}(s) + \mathcal{O}(\varepsilon)}. \end{align*}

By letting $\varepsilon \to 0^+$, this reduces to

\begin{align*} \mathbf{E}[ h(U_{(k)}) \mid s \in \mathcal{U} ] &= \frac{\sum_{j=1}^{n} \mathbf{E}[ h(U_{(k)}) \mid U_{(j)} = s ] f_{U_{(j)}}(s)}{\sum_{j=1}^{n} f_{U_{(j)}}(s)} \\ &= \frac{1}{\sum_{j=1}^{n} f_{U_{(j)}}(s)} \biggl[ \sum_{j\in[n]\setminus\{k\}} \int_{0}^{1} h(u) f_{U_{(k)},U_{(j)}}(u, s) \, \mathrm{d}u + h(s) f_{U_{(k)}}(s) \biggr]. \end{align*}

In particular, this shows that the law of $U_{(k)}$ under $\mathbf{P}[ \cdot \mid s \in \mathcal{U} ]$ is of mixed type: it has an absolutely continuous part together with an atom at $s$ of mass

$$ \mathbf{P}[ U_{(k)} = s \mid s \in \mathcal{U} ] = \frac{f_{U_{(k)}}(s)}{\sum_{j=1}^{n} f_{U_{(j)}}(s)}. $$
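For the uniform case this ratio can be simplified further. As a sketch, using the standard density of the $j$th order statistic of $n$ i.i.d. $\textrm{Unif}[0,1]$ variables (a fact not stated above) and the binomial theorem:

\begin{align*} f_{U_{(j)}}(s) &= \frac{n!}{(j-1)!\,(n-j)!}\, s^{j-1} (1-s)^{n-j} = n \binom{n-1}{j-1} s^{j-1} (1-s)^{n-j}, \\ \sum_{j=1}^{n} f_{U_{(j)}}(s) &= n \sum_{j=1}^{n} \binom{n-1}{j-1} s^{j-1} (1-s)^{n-j} = n \bigl( s + (1-s) \bigr)^{n-1} = n, \\ \mathbf{P}[ U_{(k)} = s \mid s \in \mathcal{U} ] &= \binom{n-1}{k-1} s^{k-1} (1-s)^{n-k}. \end{align*}

Under the limiting definition adopted here, this is the $\textrm{Binomial}(n-1, s)$ probability that exactly $k-1$ of the remaining $n-1$ points fall below $s$, which matches the heuristic that pinning one point at $s$ leaves the others i.i.d. uniform.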