TLDR
I'm seeking a proof for the following claim:
Let $T$ be any full tree of degree $n$. Let $m$ be the total number of nodes in $T$, and let $r$ be the number of regular nodes. Then $m > n \cdot r$.
Background
I recently came across a computer science problem, which — in essence — sought to reconstruct a full binary tree (ie. of degree 2) from a binary sequence. The sequence was formed through a depth-first traversal, in which terminal nodes ("leaves") were recorded as 0 and regular nodes as 1. For example, the tree
1
├── 1
│ ├── 0
│ └── 0
└── 1
├── 0
└── 1
├── 0
└── 0
would be encoded as 110010100.
The real challenge was to determine where each branch of the tree began and ended. For example, the encoding might be a leading subsequence in a longer sequence: 110010100010110111010010. Rather than iterate with complicated logic, I sought a mathematical shortcut.
After examining a few cases, I gravitated toward the ratio $\frac{t}{r}$ of terminal nodes $t$ to regular nodes $r$. It seemed that for a branch (or tree) of degree $n$, the encoding would end as soon as $\pmb{\frac{t}{r} > n - 1}$. If we let $m = t + r$ be the total number of nodes in the tree, then we can rewrite this inequality as
\begin{equation} \begin{split} n - 1 & < \frac{t}{r} = \frac{m - r}{r} = \frac{m}{r} - \frac{r}{r} = \frac{m}{r} - 1 \\ n & < \frac{m}{r} \\ n \cdot r & < m \end{split} \end{equation} which can be formalized in the following claim:
Claim
"Theorem": Let $T$ be any full tree of degree $n$. Let $m$ be the total number of nodes in $T$, and let $r$ be the number of regular nodes. Then $m > n \cdot r$.
Question
I am seeking either
- a reference to this "theorem" and its proof, if they already exist; or
- an original proof of this statement.
I strongly suspect that (1) already exists for my claim in some graph theoretic form, and I would be shocked if an original proof were necessary. However, any original (and rigorous) alternatives are welcome.
Informal Solution: Induction
I inclined toward a proof by Induction, which leverages the idea of "minimally extending" a full tree $T$: converting exactly one terminal node into a regular node, which has exactly $n$ terminal nodes as children. In so creating a new tree $T'$, we "lose" one terminal node but "gain" $n$ more, and we also gain one regular node: $t$ becomes $t + n - 1$ and $r$ becomes $r + 1$.
Definition
Define the statement $$ S(i)\text{: } m > n \cdot r \text{ for any tree } T \text{ formed by } i \in \mathbb{N} \text{ minimal extensions from the trivial tree } T_0 \text{.} $$
Base Case
Now there is only one trivial tree $T_0$, where the root is terminal
0
and here we have $m_0 = 1 > 0 = n \cdot (0) = n \cdot r_0$. Since $T_0$ has only one terminal node, it has only one extension $T_1$
1
├── 0 1
├── 0 2
⋮ ⋮ ⋮
└── 0 n
and here we have $m_1 = n + 1 > n = n \cdot (1) = n \cdot r_1$. Hence $S(1)$ holds.
Inductive Hypothesis
Suppose that $S(k)$ for some $k \in \mathbb{N}$: $m > n \cdot r$ for any tree $T$ formed by $k$ minimal extensions from $T_0$.
Induction Step
Now let $T_{k + 1}$ be any tree formed by $k + 1$ minimal extensions from $T_0$; and let $T_k$ be its predecessor, one extension earlier. By definition, $T_k$ was a tree formed by $k$ minimal extensions from $T_0$, so $m_k > n \cdot r_k$ by our inductive hypothesis $S(k)$. Furthermore, since a minimal extension
- loses one terminal node but gains $n$; and
- gains one regular node
we have $t_{k + 1} = t_k + n - 1$ and $r_{k + 1} = r_k + 1$ respectively. Then
\begin{equation} \begin{split} m_{k + 1} & = t_{k + 1} + r_{k + 1} = \left(t_k + n - 1\right) + \left(r_k + 1\right) = \left(t_k + r_k\right) + n \\ & = \pmb{m_k + n > n \cdot r_k + n} \\ & = n \cdot \left(r_k + 1\right) = n \cdot r_{k + 1} \end{split} \end{equation}
and we have proven $S(k + 1)$: $m > n \cdot r$ for any tree $T$ formed by $k + 1$ minimal extensions from $T_0$.
Roadblock
Unfortunately, there remains a crucial gap: how do we prove that every full tree $T$ can be formed by such a finite sequence of minimal extensions from $T_0$?
A tree with p parents(i.e. regular nodes), will have 2p+1 regular nodes
We can see this via inclusion-exclusion:
Number of nodes = Number of parents + Number of children - Number of (Parents and children)
$$ = p + np - (p-1) = np+1$$
Here, Number of (Parents and children) = p-1 as only the root node is a parent that cannot be a child. All other parents are a child to some other node.
This actually proves your claim itself. I wanted to try the second problem as well- that the extension actually generates all full trees of degree n(in fact I completely forgot about the main point until after posting XD). So here goes:
This gives us a corollary, any tree can only have nodes $m_k\in \{1+kn|k\in\{0,1,2...\}\}$.
Lemma: Given any tree with kn+1 nodes, the parents can be at a maximum distance of k from the root node. Here, distance means the number of nodes we need to cross to reach the child node from the root. (proof left to the reader)
Since the parent nodes' distances are bounded, we can order them accordingly into a set $S_k$(ties can be broken randomly). As no parent can be further away, the last parent in the set will have all its children as leaf nodes.
Now, we shall define an inverse operation: using the defined set $S_k$, we can apply an operation, which converts the last parent to a leaf node. The resultant tree formed will have kn+1-n = (k-1)n+1 nodes. The new tree's set $$S_{k-1} = S_k\text{\ } \{\text{the last parent}\}$$ Note that if we apply the original extension operation to the newly created leaf node, we get back $S_k$. Hence, it is an inverse operation.
The set $S_k$ is decreasing and bounded below by $T_0$(which has the set $\phi$). Since each operation decreases the set by 1 element, we will need exactly k operations.
The sequence of k operations can be reversed and applied to $T_0$ to get any full tree.