Prüfer code generates a connected graph?

892 Views Asked by At

I'm reading the ProofWiki page which describes the construction of a labeled tree from its Prüfer sequence, and am having trouble understanding this section:

The fact that $T$ is, in fact, a tree follows from the fact that:
a) $T$ has $n$ nodes and (from the method of construction) $(n−1)$ edges;
b) Each new edge connects two as yet unconnected parts of $T$

Specifically, I don't follow claim b) above, that each new edge connects two as yet unconnected parts of $T$. I'm trying to show that it's true as follows:

At a given step $k$, suppose that we are adding the edge $e_k = \{a_k,b_k\} $ to the graph $G_{k-1}$ with edges $e_1,\dots,e_{k-1}$, then if $a_k$ and $b_k$ were already in the same connected component, there would be a path from $a_k$ to $b_k$ in $G_{k-1}$, which together with $e_k$ gives a cycle in $G_k$. I am hoping to somehow deduce a contradiction from this, but am not really sure where to take this argument.

Any help is appreciated!

2

There are 2 best solutions below

2
On BEST ANSWER

I'll describe a somewhat different version of the algorithm and leave it to you to see if they're equivalent.

Start with $n$ unmarked vertices (1 to $n$); each connected component has one unmarked vertex. Whenever we add an edge, we'll mark one of its endpoints, so that the newly formed connected component still has exactly one unmarked vertex. Then we'll make sure to only add edges between two unmarked vertices.

Since we make one mark for each entry of $P$, when we get to entry $a_i$, there are $n - i + 1$ unmarked vertices left, but only $n - i - 1$ entries of $P$ left (including $a_i$ itself), so at least two unmarked vertices do not appear in the rest of the sequence (pigeonhole principle!) Take the smallest one as $u$, add the edge $u a_i$, and mark $u$. The edge joined two unmarked endpoints, so it joins two different connected components. We know that $a_i$ was not marked at any previous step, because we always choose the endpoint to mark from among the vertices which aren't in the remainder of the sequence.

(On ProofWiki, the "list" is just the set of unmarked vertices.)

0
On

Given any sequence $b_1,...,b_{n-2}$ where $b_i\in[n]$ for all $i\in[n-2]$, and we suppose that $b_{n-1}=n$, we can realize it as a "Prüfer code" (technically we do not know whether the sequence represents a tree since we do not know the map from the set of trees to the set of sequences is surjective or not, and that is what we are trying to prove now.) by following the algorithm so that we can find another list $a_1,...,a_{n-1}$ where $a_i\in[n]$ for all $i\in[n-1]$ which satisfies the following properties:

  1. $a_1$ is the smallest integer in the set $[n]\setminus\{b_1,...,b_{n-2}\}$;
  2. For each $i\in[n-2]\setminus \{1\}$, $a_i$ is the smallest integer in the set $$[n]\setminus(\{a_1,...,a_{i-1}\}\cup\{b_{i},...,b_{n-2}\});$$
  3. $a_{n-1}$ is the unique element in $[n]$ which does not appear in the list: $$ \{a_1,...,a_{n-2},n\} .$$

Note that the above properties are also deterministic, indeed, it is exactly the algorithm. And of course, we add an edge between $a_i$ and $b_i$ every time we have determined a new $a_i$.

Now we are ready to prove the claim b): Each new edge connects two as yet unconnected parts of $T$. Proceed by induction. The base step is clear. Suppose we have already determined $a_i$ for some $i\in[n-2]$, which means we have got a forest.

If $a_{i+1}\notin\{b_1,...,b_{i}\}$, then $a_{i+1}$ has never been joined in the first $i$ steps so $a_{i+1}$ itself is a connected component and the edge to be drawn between $a_{i+1}$ and $b_{i+1}$ surely coonnects two as yet unconnected parts of $T$.


Otherwise $a_{i+1}\in\{b_1,...,b_{i}\}$, say $a_{i+1}=b_{k}$ for some $k\in[i]$. Also, $b_{i+1}\notin\{a_1,...,a_{i}\}$ by the algorithm which implies $b_{i+1}$ has never been added to any other vertex before.

  • If $b_{i+1}\notin\{b_1,...,b_i\}$, then $b_{i+1}$ has never been joined in the first $i$ steps so $b_{i+1}$ itself is a connected component and the edge to be drawn between $a_{i+1}$ and $b_{i+1}$ surely coonnects two as yet unconnected parts of $T$.

  • Otherwise, $b_{i+1}\in\{b_1,...,b_i\}$, say $b_{i+1}=b_{k'}$ for some $k'\in[i]$ (and of course $k\neq k'$, otherwise it will result in $a_{i+1}=b_k=b_{k'}=b_{i+1}$ which can never happen by the algorithm).

We claim that $b_k$ and $b_{k'}$ belong to different connected components. To see this, notice that we are joining $b_k(=a_{i+1})$ towards $b_{k'}(=b_{i+1})$. If $b_k$ is already in the same connected component with $b_{k'}$ before we add an edge between $b_k$ and $b_{k'}$ then we can find a unique path $b_{k'},c_1,...,c_m,b_k(=c_{m+1})$ connecting $b_k$ and $b_{k'}$ where $c_1,...,c_m\in\{a_1,...,a_i,b_1,...,b_i\}$ and $m\geq 1$.

Now there are two possibilities,

  1. $c_1b_{k'}$ happens prior to $c_1c_2$: $c_1$ is first added to $b_{k'}$ so $c_2$ must be added to $c_1$ later, but this cannot happen by the algorithm(property 2).
  2. $c_1c_2$ happens prior to $c_1b_{k'}$: $c_2$ must be first added to $c_1$, then $c_1$ is added to $b_{k'}$ later. So this is the only case that can happen. Replace $b_{k'}$ by $c_1$ and replace $c_1$ by $c_2$, we can keep arguing in the same manner until we reach $b_k$. We conclude that $b_k$ must be added to $c_m$. But after joining $b_k$ to $c_m$, we can no longer join $b_k$ to $b_{k'}$. Contradiction! And we have proved that $b_k$ and $b_{k'}$ belong to different connected components. By induction we know that claim b) is true.