Is there a name for this probabilistic paradox?


Let $X\sim Exp(1)$ and $Y\sim Exp(\lambda)$, independent. Then, \begin{align} f_{X|Y=mX}(x) = \frac{f_{X,Y}(x,mx) }{\int f_{X,Y}(x,mx) \:dx }=\frac{f_X(x)f_Y(mx) }{\int f_X(x)f_Y(mx) \:dx } = \frac{e^{-(1+\lambda m)x}}{\int e^{-(1+\lambda m)x} dx} = (1+m\lambda)e^{-(1+\lambda m)x} \end{align} So $X|_{Y=mX} \sim Exp(1+m\lambda)$. That means $E[X|Y=mX]=\frac{1}{1+ \lambda m} < 1$ whenever $m>0$.
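This conditional law can be checked by Monte Carlo (a sketch; the values of `lam`, `m`, `eps`, and `n` are arbitrary choices). Conditioning on the thin strip $|Y - mX| < \varepsilon$ recovers the $Exp(1+\lambda m)$ answer above:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, m, eps, n = 2.0, 3.0, 0.01, 10_000_000

x = rng.exponential(1.0, n)          # X ~ Exp(1)
y = rng.exponential(1.0 / lam, n)    # Y ~ Exp(lam); NumPy takes scale = 1/rate

strip = np.abs(y - m * x) < eps      # condition on the strip {|Y - mX| < eps}
print(x[strip].mean())               # should be close to 1/(1 + lam*m) = 1/7
```

Note that the simulation conditions on $Y - mX$ being near zero; as the edit below shows, how one takes this limit matters.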

This makes sense mathematically: $X|_{Y=0}\sim Exp(1)$; the density at the origin, $f_{X,Y}(0,0)$, is the same for all $m$; but the joint density decays faster along steeper rays, since $f_{X,Y}(x,mx)<f_{X,Y}(x,m'x)$ for $x>0$ whenever $m'<m$.

The practical implication seems weird, though. Your friend gets to your house via one bus and one train, each with an exponentially distributed waiting time. You expect someone to wait 1 minute for a train. But they tell you they waited $m$ times as long for the bus as they did for the train, and now you have to revise your expectation about the train down?

Edit: I think the reason for the (seeming) violation of the Tower property is that I incorrectly defined $f_{X|Y=mX}$. It should instead be

$$ f_{X|Y=mX}(x) = \frac{xf_{X,Y}(x,mx) }{\int x f_{X,Y}(x,mx) \:dx} = \frac{xe^{-(1+\lambda m)x}}{\int xe^{-(1+\lambda m)x}\: dx} $$

Think about it like the flag of Seychelles: the width of the "ray" is twice as large if you go twice as far out. (Aside: this is indeed Borel's paradox.) This means that

\begin{align} E[X|Y=mX]= \frac{\int x^2e^{-(1+\lambda m)x}\: dx}{\int xe^{-(1+\lambda m)x} \:dx} = \frac{2}{1+\lambda m}. \end{align}

The distribution of slopes is the distribution of $M:=Y/X$:

\begin{align} f_{M}(m) &= \int_0^\infty f_Y(y) f_{1/X}(m/y) y^{-1}\:dy\\ &= \int_0^\infty \lambda e^{-\lambda y}\left(\frac{y}{m}\right)^2 e^{-y/m} y^{-1} \:dy\\ &= \frac{\lambda}{m^2} \int_0^\infty y e^{-y(\lambda + \frac{1}{m}) } \:dy \\ &= \frac{\lambda}{m^2}\frac{1}{(\frac{1}{m} + \lambda)^2}\\ &= \frac{\lambda}{ (1+\lambda m)^2} \end{align}

The Law of Iterated Expectations holds for this definition of the conditional density:

\begin{align} \int E[X|Y/X = m] \: dP(M\leq m) = \int_0^\infty \frac{2\lambda}{ (1+\lambda m)^3} \:dm = -\frac{1}{(1+\lambda m)^2}\bigg|_0^\infty = 1 = EX \end{align}
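The two conditionings can be contrasted numerically (a sketch; `lam`, `m`, `eps`, and `n` are arbitrary choices). Conditioning on the wedge $|Y/X - m| < \varepsilon$ gives the size-biased mean $2/(1+\lambda m)$ rather than $1/(1+\lambda m)$, which is exactly Borel's paradox, and the empirical CDF of the slope $M = Y/X$ matches $f_M$ above:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, m, eps, n = 2.0, 3.0, 0.01, 10_000_000

x = rng.exponential(1.0, n)          # X ~ Exp(1)
y = rng.exponential(1.0 / lam, n)    # Y ~ Exp(lam); NumPy takes scale = 1/rate

wedge = np.abs(y / x - m) < eps      # condition on the ray {Y/X ≈ m}
print(x[wedge].mean())               # ≈ 2/(1 + lam*m) = 2/7, the size-biased answer

# CDF of the slope M = Y/X: P(M <= m) = 1 - 1/(1 + lam*m)
print((y / x <= m).mean())           # ≈ 6/7 for lam = 2, m = 3
```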


There is 1 solution below.


The first question to ask is: what is the set of outcomes for which $Y = mX$, for a fixed and known $m$? This is just the set of ordered pairs $$\mathcal S = \{(x,y) \in \mathbb R^2 \mid x > 0,\ y = mx \}.$$ Visualized geometrically in the Cartesian coordinate plane, this is an (open) ray in the first quadrant that is a subset of the line $y = mx$. The joint density of $X$ and $Y$ is simply $$f_{X,Y}(x,y) = \lambda e^{-x} e^{-\lambda y} \mathbb 1(x > 0) \mathbb 1 (y > 0),$$ and so, given that the outcome lies on this ray, the probability density of $X$ must be proportional to

$$f_{X \mid Y = mX}(x) \propto f_{X,Y}(x,mx) = f_X(x) f_Y(mx) = \lambda e^{-x} e^{-\lambda m x} \propto e^{-(\lambda m + 1) x}.$$ This implies the conditional distribution is exponential with rate $\lambda m + 1$, hence its expectation is $1/(\lambda m + 1)$.

Note that your intuitive answer (that the conditional expectation should remain $E[X] = 1$) cannot be correct, because it does not depend on $\lambda$; but $\lambda$ is informative of $Y$ and, in turn, informative of $X$.

I do not see any intrinsic paradox here. The idea is that the knowledge that the waiting time of one variable is exactly $m$ times the waiting time of the other gives you additional information about the outcome of both.

A simple way to see this is to look at the discrete analogue, which is a geometric distribution. Suppose $X \sim \operatorname{Geometric}(1/2)$ and $Y \sim \operatorname{Geometric}(1/4)$, where $X$ and $Y$ are independent. For convenience, let the parametrizations have strictly positive support, so their individual means are $2$ and $4$, respectively. Then given that $Y = 2X$, this corresponds to the set of outcomes $$\{(1,2), (2,4), (3,6), \ldots\}$$ and so, the probability mass function of $X$ is given by $$\Pr[X = x \mid Y = 2X] = \frac{\Pr[X = x]\Pr[Y = 2x]}{\sum_{x=1}^\infty \Pr[X = x]\Pr[Y = 2x]} = \frac{(1/2)^x (3/4)^{2x-1} (1/4)}{3/23} = \frac{23}{9} \left(\frac{9}{32}\right)^x,$$ for $x \in \mathbb Z^+$. That is to say, $X$ given that $Y = 2X$ is geometric with parameter $23/32 \ne 1/2$.

In this case, we can use a coin-flipping analogy to interpret such a result: we have coin A that is fair, and coin B that has a $1/4$ probability of showing heads. Your friend flips both until the first occurrence of heads, and counts the numbers $(X, Y)$ of flips that were needed for coins A and B, respectively. He reports that coin B needed twice as many flips as coin A. This is additional information about $X$ that should affect your idea about the average number of flips for $X$; indeed, it also furnishes additional information about $Y$. In fact, it should be plainly obvious that this information must affect the posterior marginal of $Y$, since in the discrete case, $Y$ cannot be odd when $m = 2$.
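A small simulation confirms this conditional law (a sketch; the sample size and seed are arbitrary choices). In the discrete case we can condition on $Y = 2X$ exactly, with no $\varepsilon$-window needed:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.geometric(0.5, n)    # X ~ Geometric(1/2), support {1, 2, ...}
y = rng.geometric(0.25, n)   # Y ~ Geometric(1/4)

xc = x[y == 2 * x]           # condition on the event {Y = 2X}
print(xc.mean())             # ≈ 32/23 ≈ 1.391, mean of Geometric(23/32)
print((xc == 1).mean())      # ≈ 23/32 ≈ 0.719, i.e. P[X = 1 | Y = 2X]
```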

As an exercise, what is the general case for geometric $X$ and $Y$ with parameters $p_1$ and $p_2$, respectively, and for some positive integer constant $m$?

Another exercise: Is there a choice of $p_1, p_2, m$ such that the posterior for $X$ remains unchanged? Why or why not?
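For readers who want to check their answers to these exercises numerically, here is a small rejection-sampling harness (a sketch; the function name `posterior_pmf` and its defaults are my own choices, not from the thread). It estimates the posterior pmf of $X$ given $Y = mX$ for arbitrary $p_1$, $p_2$, and $m$:

```python
import numpy as np

def posterior_pmf(p1, p2, m, n=1_000_000, kmax=6, seed=0):
    """Estimate P[X = k | Y = mX] for k = 1..kmax by rejection sampling."""
    rng = np.random.default_rng(seed)
    x = rng.geometric(p1, n)       # X ~ Geometric(p1), support {1, 2, ...}
    y = rng.geometric(p2, n)       # Y ~ Geometric(p2)
    xc = x[y == m * x]             # keep only draws with Y = mX
    return [(xc == k).mean() for k in range(1, kmax + 1)]

# The worked example above: p1 = 1/2, p2 = 1/4, m = 2
print(posterior_pmf(0.5, 0.25, 2))  # first entry ≈ 23/32 ≈ 0.719
```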