Counting number of unlabeled graphs via number of connected graphs

328 Views Asked by At

Let $G$ be a family of simple, unlabeled and undirected graphs, $G(n)$ the graphs on $G$ with $n$ vertices and $G^c(n)$ the graphs of $G(n)$, which are connected. I thought about the problem of determing $|G(n)|$, when the values of $|G^c(k)|$ are known for $k = 1\dots n$.

All families I am looking at have the nice property, that the connected components of a graph $g \in G$ are also in $G$. The family of bipartite graphs for example has this property, since every connected component of a bipartite graph is bipartite.

I developed a formula for the problem described at the beginning, that i thought would be right, but implementing this formula calculates wrong numbers.

Some notation: Let $P(n)$ be the set of (distinct) number partitions of $n$ and for a partition $p = (p_1, p_2, \dots, p_m) \in P(n)$ let $l(p) := m$ the number of "parts" of $p$. For example for the partition $p = (4,2,2,1)$ of the integer $9$ is $l(p) = 4$

The formula i developed looks like this: $$|G(n)| = \sum_{p \in P(n)} \Bigl( \prod_{k=1}^{l(p)} |G^c(p(k))|\Bigr)$$

I came up with this formula because of some basic combinatorial arguments:

  • For a given partition $p = (p_1, \dots , p_m) \in P(n)$, the expression $\prod_{k=1}^{l(p)} |G^c(p(k))|$ counts the graphs of $G(n)$, which consist of connected components of sizes $p_1, \dots , p_m$
  • If a graph has connected components of sizes $p_1, \dots, p_m$ (ordered descending), then $(p_1, \dots, p_m) \in P(n)$

So if I summarize over all partitions of $n$ I should get the right answer. However, i implemented this formula for my bachelor thesis and get wrong numbers. Here is a table showing the numbers for bipartite graphs:

\begin{array}{c|c|c|c|c|c} n & 8 & 9 & 10 & 11 & 12 \\ \hline \text{right number} & 303 & 1119 & 5479 & 32303 & 251135 \\ \hline \text{calculated number} & 306 & 1122 & 5495 & 32322 & 251320 \\ \hline \text{difference} & 3 & 3 & 16 & 19 & 185 \end{array}

For $n \leq 7$, the numbers are right. Above $n=7$, the numbers calculated with the formula (second row) are higher than the right numbers (first row). In the third row i put the difference between the two.

My question is:

Did I miss some combinatorial argument while developing the formula? Are there for example some edge cases i am missing so that some graphs get counted twice with my formula?

2

There are 2 best solutions below

4
On BEST ANSWER

The problem shows us up in cases where you have

  1. A partition with two parts of the same order $s$, and
  2. More than one graph of order $s$ in your family.

For the family of bipartite graphs, this needs $n \ge 8$ to happen, because only at $s=4$ do we have more than one connected bipartite graph of order $s$. They are $C_4$ (the cycle with $4$ vertices), $P_4$ (the path with $4$ vertices), and $K_{1,3}$ (the star with $4$ vertices).

When you're counting graphs with two components of order $4$, a graph with two different components is double-counted. For example, say $G$ is the disjoint union of $P_4$ and $C_4$. Then it's counted once when you choose $P_4$ as the first component, and $C_4$ as the second; it's counted again when you choose $C_4$ as the first component, and $P_4$ as the second.

As a result, the disjoint unions $P_4 \sqcup C_4$, $P_4 \sqcup K_{1,3}$, and $C_4 \sqcup K_{1,3}$ are the $3$ bipartite graphs of order $8$ that you've double-counted.

The other answer gives you a way to solve this problem with generating functions. If you prefer to stick to the same kind of formula you already have, you will need to update the $$\prod_{k=1}^{l(p)} |G^c(k)|$$ term in your formula. Let $n_s(p)$ denote the number of parts of order $s$ in $p$. Then the number of ways to choose the components of those orders is given by the multiset number $$ \left(\!\!\binom{|G^c(s)|}{n_s(p)}\!\!\right) = \binom{|G^c(s)|+n_s(p)-1}{n_s(p)} $$ and so we change the formula to $$ \sum_{p \in P(n)} \prod_{s \ge 1}\binom{|G^c(s)|+n_s(p)-1}{n_s(p)}. $$

0
On

The formula you are using is not correct. The correct formula uses ordinary generating functions. Suppose we have unlabeled objects which may or may not be connected and all objects are constructed from connected sums of multisets of connected objects. Define the two related ordinary generating functions $$ C(x) := \sum_{n=0}^\infty c_n x^n, \quad A(x) := \sum_{n=0}^\infty a_n x^n = \prod_{k=1}^\infty (1 - x^k)^{-c_k} \tag{1} $$ where $\ c_n\ $ is the number of connected objects of "size" $n$ and where $\ a_n\ $ is the number of objects of "size" $n$. In the case of bipartite graphs, the OEIS sequence A005142 is the number of connected bipartite graphs with $n$ nodes and the OEIS sequence A033995 is the number of bipartitie graphs with $n$ nodes. Note that, for example,

$$ 1 + 1x + 2x^2 + 3x^3 + 7x^4 + 13x^5 + 35x^6 +\ ... = \\ (1 - x^1)^{-1} (1 - x^2)^{-1} (1 - x^3)^{-1} (1 - x^4)^{-3} (1 - x^5)^{-5} (1 - x^6)^{-17} \cdots \tag{2} $$ is the factorization of the number of bipartite graphs generating function.

Each connected object of "size" $k$ contributes exactly one factor $$ (1-x^k)^{-1} = \sum_{i=0}^\infty (x^k)^i \tag{3} $$ to the product. Here $i$ is the multiplicity of the object in the multiset.