Predecessor probabilities regarding Uniform Attachment Tree

29 Views Asked by At

Summary:

My question is: What is a formal argument for the fact that for a given uniform attachment tree $G$ with arbitrary vertex labels, each possible predecessor (i.e., the tree before its last step of construction) is equally likely?


More details:

A uniform attachment tree is a tree which grows according to the following procedure:

  1. Start with an initial vertex, say $V \gets \{ v_1 \}$.
  2. For $k = 2, \dots, n$, attach vertex $v_k$ to an existing vertex $w \in V$. $w$ is chosen uniformly at random.

Therefore you end up with a random tree of size $n$ with $n-1$ edges.

Now, when given a tree $G$ with arbitrary vertex labels (i.e., we do not know which vertex was added at which step), we would like to know what the probabilities are for the previous step. My intuition says that each possible predecessor is equally likely, or in other words $$ \mathbb P (G_{n-1} = H ~|~ G) = \frac 1 l $$ where $l \in \mathbb N$ is the number of leaves and $H$ a possible predecessor.

A possible predecessor of $G$ is a graph which is similar to $G$ except that a leaf vertex $v$ has been removed from the vertex set( and consequently all edges which contain $v$).

Despite having an inituition for the problem, I'd like to have some formal proof of this.

1

There are 1 best solutions below

2
On BEST ANSWER

The predecessor probabilities are not all equal.

Consider the following tree:

a -- b -- c -- d
          |
          e

There are $28$ possible orders in which this tree could have been built. Of them, $8$ end with e (abcde, bacde, bcade, bcdae, cbade, cbdae, cdbae, dcbae), and $8$ end with d (abced, baced, bcaed, bcead, cbaed, cbead, cebad, ecbad), but $12$ end in a (bcdea, bceda, cbdea, cdbea, cdeba, cbeda, cebda, cedba, dcbea, dceba, ecbda, ecdba), so the three possible predecessors have probabilities of $\frac27$, $\frac27$, and $\frac37$.

Some intuition for this calculation: we can think of a uniform attachment tree as a "uniform detachment" tree in which we start with the tree above and detach leaves one at a time until we're left with a single vertex. All possible detachment orders are equally likely, because they are all reverse orders of how a uniform attachment tree could have been built. In the tree above:

  • If d or e is detached first, then we are left with $P_4$, which has $8$ detachment orders: for each step, we decide whether to detach the left leaf or the right leaf.
  • If a is detached first, then we are left with $K_{1,3}$, which has $12$ detachment orders: there's $3 \cdot 2$ ways to detach two leaves of $K_{1,3}$, which leaves us with a $P_2$ that we can turn into a vertex in $2$ ways.

In general, the probabilities of the predecessors are proportional to the number of detachment orders they have. This has a recursive definition: if $d(T)$ is the number of detachment orders of a tree $T$, then $$d(T) = \sum_{v \in L(T)} d(T-v)$$ where $L(T)$ is the set of leaves of $T$. If $G_1, \dots, G_n$ is the sequence of trees built by uniform attachment, then $$\Pr[G_{n-1} = T-v \mid G_n = T] = \frac{d(T-v)}{d(T)}$$ for all $v \in L(T)$.