Finding optimal (=lowest cost) roads connecting all cities where cost of roads are iid exponentially distributed (minimum spanning tree)

44 Views Asked by At

I’m trying to solve/understand some exercises about problems involving exponential distributions.

The professor in the following video (https://www.youtube.com/watch?v=DfROPYAjbfM&list=PLV3oHJg9b1NRk4_LKUdqXPoN9jOWRypKI&index=48), presents the so-called “Minimum Spanning Tree” problem, which says:

There are $n$ cities and roads must be constructed so that every pair of cities is connected. Assume that all the $\binom{n}{2}$ roads connecting any two cities have independent costs that are exponentially distributed with parameter one. Find the expected value of the minimal cost $c_n$ to connect all cities, assuming that $n=3$. That is, find $E(c_3)$.

He argues as follows. Let $X_i = $ the cost of the $i$-th cheapest road. Since we only have $3$ cities, the cheapest road connecting all three cities will of course simply be the two cheapest roads together.

So, $$E(c_3) = E(X_1 + X_2) = E(X_1) + E(X_2).$$

Now, $X_1$ is the minimum of three independent exponential rv’s each of which has parameter equal to $1$, so $X_1$ is (by the superposition principle) in distribution equal to $\exp(3)$. This immediately implies that $E(X_1)=1/3$. So far, so good, I (think I) understand it still.

But now we wish to calculate $E(X_2)$. The precise argument that he gives, I don’t know (for I don’t understand it). He says, consider $X_2$ and split it up as $X_2 = X_1 + (X_2-X_1)$. Then I think (?) he says that the distribution of the increment $X_2-X_1$ has distribution $\exp(2)$. But this I find weird, since I’d argued as follows: $X_2$ is the minimum of two independent exponentially distributed rv’s, so $X_2$ is $\exp(2)$. I do not understand it. He finally concludes that $E(c_3) = E(X_1) + E(X_2) = 1/3 + (1/3 + 1/2).$ So my problem is that I don't understand why $E(X_2) = 1/3 + 1/2.$

I’m generally confused by the memoryless principle (which, I believe, plays a role in this—although it’s not mentioned explicitly in any of the explanations in the video). Since I’ve gotten stuck, and extremely confused, and no further thinking by myself seems to help, it would be great if someone could shed some light on this.

It would be great to have a (formal) explanation, a proof, of this; to have it clearly stated which principle is used in which step, and how the claimed result follows.

Thanks in advance!

1

There are 1 best solutions below

7
On

Let me first give you some intuition. One way to think of these exponentially-distributed costs is as a waiting time. Let's say that:

  • Each road has a Poisson process associated with it, with a rate of $1$ event per second.
  • When the event happens to a road, is cost is determined: the cost is the number of seconds that have elapsed so far.

When we want to determine the cheapest road, we are waiting for the first of $3$ events to happen. The merged process has $3$ events per second, so we wait $\frac13$ seconds in expectation. The average cost is $\frac13$.

Once that happens, we resume waiting - but now we are only tracking two processes, because the cost of the cheapest road has already been determined. The merged process has $2$ events per second, so we wait $\frac12$ seconds longer in expectation. So the average total time we've waited for the second-cheapest road is $\frac13 + \frac12$.

(And if we add the costs, we get $\frac13 + (\frac13 + \frac12)$.)


Reasoning formally with random variables, the mistake you are making is saying

$X_2$ is the minimum of two independent exponentially distributed rv’s, so $X_2$ is $\text{Exp}(2)$.

$X_2$ is not the minimum of two exponentials; it is the median of three! Alternatively, it is the minimum of two random variables which are exponentials-conditioned-on-being-larger-than-the-smallest-exponential.

Let's say the individual roads cost $R_1, R_2, R_3$, so that $X_1 = \min\{R_1, R_2, R_3\}$. To deal with $X_2$, we can condition on the value of $X_1$, and then apply the law of total expectation.

Given that $X_1 = t$, there are three equally likely cases: $X_1$ can be $R_1, R_2, R_3$. We will average over all of them, but all of them give the same expected value for $X_2$, so it's enough to consider $X_1 = R_1 = t$.

Conditioned on this event, the cost of the 2nd road is no longer $R_2$ but $R_2 \mid R_2 > t$. Similarly, the cost of the 3rd road is $R_3 \mid R_3 > t$. Therefore \begin{align} (X_2 \mid X_1 = R_1 = t) &= \min\{ (R_2 \mid R_2 > t), (R_3 \mid R_3 > t)\} \\ &= \min\{ (R_2 - t \mid R_2 > t) + t, (R_3 - t \mid R_3 > t) + t\} \\ &= \min\{(R_2 - t \mid R_2 > t), (R_3 - t \mid R_3 > t)\} + t. \end{align} The motivation for pulling out the $t$ like this is that now, by memorylessness, the distributions of $(R_2 - t \mid R_2 > t)$ and of $(R_3 - t \mid R_3 > t)$ are independent $\text{Exp}(1)$'s. Therefore their min is $\text{Exp}(2)$, its expectation is $\frac12$, and the overall expectation of $X_2$ is $\frac12 + t$.

If $\mathbb E[X_2 \mid X_1 = t] = \frac12 + t$, then $\mathbb E[X_2 \mid X_1] = \frac12 + X_1$, so $$\mathbb E[X_2] = \mathbb E[\mathbb E[X_2 \mid X_1]] = \frac12 + \mathbb E[X_1] = \frac12 + \frac13.$$