Strange distribution of triangle shapes connecting the leaves of a perfect $k$-ary tree

63 Views Asked by At

The graph metric on the leaves of a perfect $k$-ary tree is ultrametric, as with any uniform-depth tree. Roughly speaking, this means that for any three leaves, the "triangle" of paths connecting them is either equilateral or "skinny isosceles", with the base shorter than the legs. We can partially characterize the "skinniness" of one of the isosceles triangles by the positive natural number $n := \text{(leg length) } - \text{ (base length)}$, with equilateral triangles having skinniness $n = 0$.

By running numerical simulations, I find that in the limit of large uniform depth, the fraction of all possible leaf-connecting triangles that have skininess $n$ appears to be $$\cases{\frac{k-2}{k+1} & if $n = 0$, \\ 3 \frac{k-1}{k+1} k^{-n} & if $n > 0$.}$$

It seems reasonable to me that the fraction of triangles with skinniness $n$ falls off exponentially in $n$ with base $k$, but I don't understand why the isosceles expression doesn't extrapolate naturally back to the equilateral case; there are many fewer equilateral triangles than the $3(k-1)/(k+1)$ that we would get from simply extrapolating the $n > 0$ expression back to $n = 0$. Is there a simple way to derive the distribution above?

1

There are 1 best solutions below

5
On BEST ANSWER

We can describe a leaf in a perfect $k$-ary tree of depth $d$ as an element of $[k]^d$: an sequence of $d$ integers from $[k] = \{1,2,\dots,k\}$. (This sequence just tells us the path from the root to the leaf by specifying the edge to take at each step.)

With this description, the distance between two leaves is determined by the number of positions at the beginning of their sequences that they share: the more, the closer they are. For example, $(3,1,4,1,5)$ is closer to $(3,4,5,6,7)$ than to $(1,2,3,4,5)$, because they share the $3$, and even closer to $(3,1,4,2,8)$.

An equilateral triangle is a triple of leaves whose sequences have the same values for the first few steps, then they all diverge simultaneously. The number of these is $$ \sum_{i=0}^{d-1} k^i \cdot (k(k-1)(k-2)) \cdot (k^3)^{d-i-1} $$ where the $i^{\text{th}}$ term counts equilateral triangles that diverge after $i$ steps. (The factor of $k^i$ tracks their common steps, then $k(k-1)(k-2)$ tracks the number of ways in which they can split, and then we pick up a factor of $k^3$ for each of the remaining steps they take.) There are $k^{3d}$ triples of leaves total (allowing for repetition, which is insignificant), and so the fraction of triangles that are equilateral is $$ \frac1{k^{3d}}\sum_{i=0}^{d-1} k^i \cdot (k(k-1)(k-2)) \cdot (k^3)^{d-i-1} = \sum_{i=0}^{d-1} k^i \cdot (k(k-1)(k-2)) \cdot \frac1{k^{3i+3}} = \frac{k-2}{k+1}(1 - k^{-2d}). $$ (I'm cheating a bit, because the $k^{3d}$ allows for degenerate equilateral triangles, but the sum doesn't, but the difference vanishes as $d \to \infty$.) As $d \to \infty$, this fraction approaches $\frac{k-2}{k+1}$.

For isosceles triangles with skinniness $s$, the corresponding sum is $$ \sum_{i=0}^{d-s-1} k^i \cdot 3k(k-1) \cdot (k^2)^{s-1} \cdot k^2(k-1) \cdot (k^3)^{d-i-s-1}. $$ The paths all stay together for $i$ steps, then one splits off from the other two in $3k(k-1)$ ways, then there are two paths for $s-1$ more steps, then the two paths that are still together split up in $k^2(k-1)$ ways (that is, they split in $k(k-1)$ ways, and the third path also has $k$ options), and then finally there are three paths for the the remaining $d-i-s-1$ steps.

Again, if we divide by $k^{3d}$, we get a geometric sum $$ \frac1{k^{3d}} \sum_{i=0}^{d-s-1} k^i \cdot 3k(k-1) \cdot (k^2)^{s-1} \cdot k^2(k-1) \cdot (k^3)^{d-i-s-1} = \frac{3(k+1)}{(k-1)k^s}(1 - k^{-2d}). $$ This approaches $\frac{3(k-1)}{(k+1)k^s}$ as $d \to \infty$.

The intuition for why we get $k-2$ when $s=0$ and $3(k-1)$ for $s>0$ is that if three paths split up in two stages, we have more freedom to choose how they split up: we get to pick which path splits off from the other two first, and we have a bit more freedom when we ask for paths that split apart to be distinct.