Kruskal Algorithm proof

117 Views Asked by At

I was reading the proof of Kruskal Algorithm Reference.

What I am not understanding is why we assume that Kruskal always chooses the smallest edges? Maybe it did not choose one of the smallest edges because it could form a cycle? So in the proof here (and actually in many other ones)Reference, at the end of the proof, they conclude that $w(e_i) = w(e_0)$ since Kruskal chose $e_i$ which implies $w(e_i) \leq w(e_0)$, but maybe Kruskal didn't choose $e_0$ because it could form a cycle even though $w(e_0) \leq w(e_i)$?

3

There are 3 best solutions below

0
On

If $w(e_0)< w(e_i)$ and $e_0$ was not chosen then it means that it created a cycle with a subtree of $T$. This means that you will find an edge in $T$ that is not in $T_0$ and with strict smaller weight than $e_i$ which contradict the definition of $e_i$.

2
On

The idea is, you've made a forest $F$ out of your chosen edges.

By induction assume there exists $T$ a minimum spanning tree containing those edges and none of the rejected edges.

Then the algorithm chooses $e_1$ ($e_i$ in your link).

If $e_1$ is in $T$ then the inductive conclusion is established.

Otherwise, $T + e_1$ contains a cycle $C$.

$C$ contains an edge $e_2$ ($e_0$ in your link), distinct from $e_1$, not in $F$. Because otherwise $F+e_1$ would have a cycle.

$e_2$ cannot be smaller than $e_1$, because it would have been previously chosen by the algorithm. I think this is the part you are asking about. Adding $e_2$ to $F$ can't form a cycle because $T$ is a supergraph of $F$ and $T$ without $e_1$ doesn't form a cycle.

$e_2$ cannot be larger than $e_1$, because then $T + e_1 - e_2$ is a smaller spanning tree than $T$, violating the induction assumption.

So $T + e_1 - e_2$ is a minimum spanning tree containing $F+e_1$ and the inductive conclusion is satisfied.

0
On

In the linked proof $T$ is a spanning tree produced by Kruskal's algorithm and $T_0$ is a minimum spanning tree. Also $e_1, e_2, \ldots, e_{i-1}$ are all edges of both $T$ and $T_0$, and $e_i$ is an edge of $T$ but not one of $T_0$.

The edge $e_0$ is chosen to be an edge of the cycle $C$ in $G_0$, but not an edge of $T$. The edges of $G_0$ are the edges of $T_0$ together with $e_i$, which is an edge of $T$, so $e_0$ must be an edge of $T_0$.

Hence $e_1, e_2, \ldots, e_{i-1}$ and $e_0$ are all edges of the tree $T_0$ and trees don't contain cycles, so in Kruskal's algorithm choosing $e_0$ rather than $e_i$ as the next edge would not make a cycle.