Order Statistic, independence and verification with simulation

100 Views Asked by At

I have three exponentially distributed random variables $X_2,X_3,X_4$ distributed with rates $\lambda_2=2,\lambda_3=3,\lambda_4=4$. Im trying to find the distribution of $X_3$ given that $X_3$ is the $i^{th}$ order statistic. When I simulate this I get the following results for $X_3 > 0.1$:

>>> import numpy as np
>>> n2 = np.random.exponential(1/2.0,1e5)
>>> n3 = np.random.exponential(1/3.0,1e5)
>>> n4 = np.random.exponential(1/4.0,1e5)
>>> len(reduce(np.intersect1d, (np.where(n3<n4)[0],np.where(n3<n2)[0],np.where(n3 > 0.1)[0])))
13689

>>> (3.0/9.0)*np.exp(-9*0.1)
0.13552321991353303

There are two aspects of this that I don't understand.

  1. Since $X_4$ and $X_2$ are independent, can't this be broken down into the product of two terms $\frac{\lambda_3}{\lambda_3+\lambda_2}$ and $\frac{\lambda_3}{\lambda_3+\lambda_4}$. I know that that the $\frac{3}{9}$ comes from the minimum of random exponential random variables which means that these two terms are not in fact independent. But where does this dependence come from?

  2. I get where the form of $\exp(-9*0.1)$ comes from, but why is this 9 and not 3. Why did the term np.where(n3<0.1) go from an exponential distributed RV with rate 3 to rate 9?

My current answer to these questions is that my written interpretation of the code isn't what is actually going on. What is the correct way to interpret the code I have written?

1

There are 1 best solutions below

3
On BEST ANSWER

What is the correct way to interpret the code I have written?

Your program correctly simulates the following probability (if you divide your first output by the number of samples, to obtain the relative frequency): $$\begin{align} &P(\{X_4>X_3\}\cap\{X_2>X_3\}\cap\{X_3>0.1\})\\ &=\int_{-\infty}^\infty P(\{X_4>X_3\}\cap\{X_2>X_3\}\cap\{X_3>0.1\}\mid X_3=t)\,\rho_{X_3}(t)\,dt\\ &=\int_{-\infty}^\infty P(\{X_4>t\}\cap\{X_2>t\}\cap\{t>0.1\}\mid X_3=t)\,\rho_{X_3}(t)\,dt\\ &=\int_{0.1}^\infty P(\{X_4>t\}\cap\{X_2>t\}\mid X_3=t)\,\rho_{X_3}(t)\,dt\\ &=\int_{0.1}^\infty P(X_4>t)\,P(X_2>t)\,\rho_{X_3}(t)\,dt\\ &=\int_{0.1}^\infty e^{-4t}\cdot e^{-2t}\cdot 3e^{-3t}\,dt=3\int_{0.1}^\infty e^{-9t}dt=\frac{3}{9}\int_{9(0.1)}^\infty e^{-u}du \\ &=\frac{1}{3}\big[-e^{-u}\big]_{9(0.1)}^\infty\\ &=\frac{1}{3}e^{-9(0.1)}. \end{align}$$

In the code

len(reduce(np.intersect1d, (np.where(n3<n4)[0],np.where(n3<n2)[0],np.where(n3 > 0.1)[0])))

the part

reduce(np.intersect1d, (np.where(n3<n4)[0],np.where(n3<n2)[0],np.where(n3 > 0.1)[0]))

collects all the indices $i\in\{1,...,10^5\}$ for which all three of the following conditions are satisfied:

n3[i] < n4[i]
n3[i] < n2[i]
n3[i] > 0.1

then len returns the number of such indices that were found. Thus, dividing this number by $10^5$ will give a relative frequency that estimates the above probability.


Im trying to find the distribution of $X_3$ given that $X_3$ is the $i$th order statistic.

You seem to be looking first at the case of $X_3$ being the first order statistic, i.e. when $X_3=\min\{X_2,X_3,X_4\}$. For example, you might want to simulate $$\begin{align} &P(\{X_3 > 0.1\}\mid \{X_3=\min\{X_2,X_3,X_4\}\}\\ \\ &=P(\{X_3 > 0.1\}\mid \{X_3<X_4\}\cap\{X_3<X_2\}\\ \\ &=\frac{P(\{X_3 > 0.1\}\cap\{X_3<X_4\}\cap\{X_3<X_2\})}{P(\{X_3<X_4\}\cap\{X_3<X_2\})}\\ \end{align}$$

where the numerator is just the probability that you've already simulated, and the denominator is as follows (using the same methods as before):

$$\begin{align} &P(\{X_3<X_4\}\cap\{X_3<X_2\})\\ &=\int_{-\infty}^\infty P(X_4>t)\,P(X_2>t)\,\rho_{X_3}(t)dt\\ &=\int_{0}^\infty e^{-4t}\cdot e^{-2t}\cdot 3e^{-3t}\,dt=3\int_{0}^\infty e^{-9t}dt=\frac{3}{9}\int_{0}^\infty e^{-u}du \\ &=\frac{1}{3}\big[-e^{-u}\big]_{0}^\infty\\ &=\frac{1}{3}. \end{align}$$

Therefore, $$P(\{X_3 > 0.1\}\mid \{X_3=\min\{X_2,X_3,X_4\}\})=e^{-9(0.1)}.$$

To simulate this, you would only need to adjust your code to something like the following:

numer = len(reduce(np.intersect1d, (np.where(n3<n4)[0],np.where(n3<n2)[0],np.where(n3 > 0.1)[0])))
denom = len(reduce(np.intersect1d, (np.where(n3<n4)[0],np.where(n3<n2)[0])))
print(float(numer)/float(denom))

NB: This latter probability is the same as $P(\min\{X_2,X_3,X_4\}>0.1),$ which could also be simulated by simply counting how often it happens that min(n2[i],n3[i],n4[i])>0.1, etc.