Intuitive basis of Möbius inversion?


If we're given $f(n)= \sum_{d \mid n}g\left(\frac{n}{d}\right)$ for $n \in \mathbb{N}$, then Möbius inversion gives $$g(n)=\sum_{d \mid n}\mu \left( d\right) f \left( \frac{n}{d}\right).$$ Also, the generalised Möbius inversion formula states that the above remains correct for real $n \geq 1$ if we replace the divisor condition $d \mid n$ by the range $1\le d\le n$ (with $d$ still running over the positive integers).

I'm familiar with the standard proof of the vanilla and generalised inversion formulae, but it relies on blind algebraic manipulation of Dirichlet convolution (and a generalised convolution between arithmetic and non-arithmetic functions), which I find unsatisfactory.

Is there a more intuitive way of looking at the (generalised/ungeneralised) Möbius inversion formula, perhaps with knowledge garnered from more advanced theory?


There are 3 answers below.

---

A good approach is to compute the cyclotomic polynomial $\Phi_n(X)$ by factorizing $X^n-1$ using inclusion-exclusion. You want to get hold of a polynomial with integer coefficients whose roots are, say, exactly the primitive 30th roots of unity (those which are not roots of unity of any lower order). Such a polynomial has to be a factor of $X^{30}-1$. We have to remove the 15th roots of unity, so divide out by $X^{15}-1$. But $-1$ is still an undesirable root, so divide out by $X+1$, and continue. You will rediscover the Möbius inversion principle.
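This divide-out process can be carried to completion with the identity $\Phi_n(X) = \prod_{d \mid n} (X^{n/d} - 1)^{\mu(d)}$. A minimal sketch, using plain coefficient lists and no external libraries (all function names here are ad hoc for illustration):

```python
# Sketch: compute the n-th cyclotomic polynomial via
#   Phi_n(X) = prod_{d | n} (X^{n/d} - 1)^{mu(d)}.
# Polynomials are coefficient lists, index i = coefficient of X^i.

def mobius(n):
    """mu(n): 0 if a square divides n, else (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one leftover prime factor
        result = -result
    return result

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_divexact(a, b):
    """Exact division a / b for monic b (remainder is known to be zero)."""
    a = a[:]
    q = [0] * (len(a) - len(b) + 1)
    for i in range(len(q) - 1, -1, -1):
        q[i] = a[i + len(b) - 1] // b[-1]
        for j, bj in enumerate(b):
            a[i + j] -= q[i] * bj
    return q

def cyclotomic(n):
    num, den = [1], [1]
    for d in range(1, n + 1):
        if n % d == 0:
            factor = [-1] + [0] * (n // d - 1) + [1]   # X^(n/d) - 1
            m = mobius(d)
            if m == 1:
                num = poly_mul(num, factor)
            elif m == -1:
                den = poly_mul(den, factor)
    return poly_divexact(num, den)

print(cyclotomic(30))   # [1, 1, 0, -1, -1, -1, 0, 1, 1],
                        # i.e. X^8 + X^7 - X^5 - X^4 - X^3 + X + 1
```

The exponent $\mu(d)$ records exactly the "divide out, but we over-divided, so multiply back" bookkeeping of the paragraph above.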

---

The inversion formula can be discovered by working with many examples. Write out $f(1) = g(1)$, $f(2) = g(2) + g(1)$, $f(3) = g(3) + g(1)$, $f(4) = g(4) + g(2) + g(1)$, and so on, and then successively solve for $g(n)$ in terms of $f$-values: $$ g(1) = f(1), \quad g(2) = f(2) - f(1), \quad g(3) = f(3) - f(1), \quad g(4) = f(4) - f(2), $$ and so on. With enough data you discover that $g(n)$ is a sum of terms $\pm f(n/d)$ where $d$ runs over the squarefree divisors of $n$. Understanding the pattern in the $\pm$ coefficients eventually leads you to the Moebius function.
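This successive solving is easy to mechanize. A small sketch with arbitrary (hypothetical) test data: build $f$ from a known $g$, then recover $g$ by exactly the recursion above:

```python
# Sketch: build f(n) = sum_{d|n} g(d) from arbitrary test data g, then
# solve the triangular system back one n at a time:
#   g(n) = f(n) - sum of g(d) over the proper divisors d of n.

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

g = {n: n * n + 1 for n in range(1, 31)}   # arbitrary (hypothetical) data
f = {n: sum(g[d] for d in divisors(n)) for n in g}

g_rec = {}
for n in range(1, 31):
    g_rec[n] = f[n] - sum(g_rec[d] for d in divisors(n)[:-1])

assert g_rec == g   # the recursion recovers g exactly
```

Expanding `g_rec[n]` symbolically in the $f$-values, rather than numerically as here, is what reveals the squarefree $\pm$ pattern: for instance $g(12) = f(12) - f(6) - f(4) + f(2)$.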

The Moebius function is a very unnatural thing when you first see it; there is no way around that. But after you work with it enough, it doesn't feel so strange anymore.

The most important property of the Moebius function is that $\mu(1) = 1$ and $\sum_{d \mid n} \mu(d) = 0$ for $n > 1$. These properties in fact characterize it (only one sequence of numbers can fit these conditions). And these are the properties you need to prove Moebius inversion.
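That characterization is strong enough to compute with. A sketch that takes $\mu(1) = 1$ and $\sum_{d \mid n} \mu(d) = 0$ as the *definition* and solves the second equation for $\mu(n)$, one value at a time:

```python
# Sketch: compute mu purely from its characterizing properties,
#   mu(1) = 1   and   sum_{d | n} mu(d) = 0  for n > 1,
# by solving the second equation for the top term mu(n).

def mu_table(N):
    mu = {1: 1}
    for n in range(2, N + 1):
        mu[n] = -sum(mu[d] for d in range(1, n) if n % d == 0)
    return mu

mu = mu_table(30)
# The values agree with the usual description: -1 at primes,
# 0 whenever a square divides n, and so on.
```

Since each $\mu(n)$ is forced by the earlier values, this loop is also a proof that only one sequence fits the two conditions.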

There are efficient ways of explaining Moebius inversion using its role as coefficients in the Dirichlet series $1/\zeta(s) = \sum_{n \geq 1} \mu(n)/n^s$, but I don't know if you are familiar with Dirichlet series.

---

My favorite explanation for Moebius Inversion is the same as my favorite explanation of a lot of things: It's secretly linear algebra!

Call a poset $(P, \leq)$ "locally finite" if for any $a, b \in P$ the set $[a,b] = \{ x \mid a \leq x \leq b \}$ is finite.

Now for any locally finite $P$ let's look at the Incidence Algebra of $P$, defined as

$$ I(P) \triangleq \{ \alpha : P \times P \to \mathbb{C} \mid \alpha(x,y) = 0 \text{ if } x \not \leq y \} $$

This might feel like it's being pulled out of a hat (and it is), but you asked for "knowledge garnered from more advanced theory", and the price we pay for that more advanced theory is that sometimes it's not a priori clear why we choose the definitions we do. But trust that this can be motivated, and let's carry on with the show.

Now, $I(P)$ is obviously a vector space over $\mathbb{C}$, and we can identify our choices of $\alpha$ with square matrices! The columns and rows are indexed by $P$, and the entry $A_{x,y} = \alpha(x,y)$. Notice now what matrix multiplication looks like:

$$(\alpha \star \beta)(x,y) = \sum_z \alpha(x,z) \beta(z,y)$$

but now the definition of $I(P)$ begins to motivate itself! Since $\alpha(x,z) = 0$ whenever $x \not \leq z$ and $\beta(z,y) = 0$ whenever $z \not \leq y$, only terms with $x \leq z \leq y$ survive, and the local finiteness of our poset ensures this sum is actually finite!

$$(\alpha \star \beta)(x,y) = \sum_{x \leq z \leq y} \alpha(x,z) \beta(z,y)$$

Of course, in the divisibility lattice (the positive integers ordered by $x \leq y \iff x \mid y$), we get a special case

$$(\alpha \star \beta)(m,n) = \sum_{m \ \mid \ d \ \mid \ n} \alpha(m,d) \beta(d,n)$$
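In particular, Dirichlet convolution is the special case where $\alpha(m,n)$ depends only on the ratio $n/m$. A quick numerical check of this, with arbitrary (hypothetical) arithmetic functions `a` and `b`:

```python
# Sketch: if alpha(m, n) = a(n // m) and beta(m, n) = b(n // m) depend only
# on the ratio n/m, the incidence product evaluated at (1, n) is exactly the
# Dirichlet convolution (a * b)(n) = sum_{d | n} a(d) b(n // d).

def incidence_product(alpha, beta, m, n):
    # sum over the chain m | d | n, as in the displayed formula
    return sum(alpha(m, d) * beta(d, n)
               for d in range(m, n + 1) if d % m == 0 and n % d == 0)

def dirichlet(a, b, n):
    return sum(a(d) * b(n // d) for d in range(1, n + 1) if n % d == 0)

a = lambda k: k * k      # arbitrary (hypothetical) arithmetic functions
b = lambda k: k + 3

alpha = lambda m, n: a(n // m)
beta = lambda m, n: b(n // m)

for n in range(1, 50):
    assert incidence_product(alpha, beta, 1, n) == dirichlet(a, b, n)
```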

So now matrix multiplication is the secret to many of the convolution formulas you've seen! Before we move on, we'll want to know when we can invert matrices in $I(P)$, and a bit of work shows

$\alpha$ is invertible if and only if for all $x \in P$, $\alpha(x,x) \neq 0$.


Now, there's an obvious "canonical" element of $I(P)$ which we can associate with our poset. For historical reasons, we call it $\zeta$, and it's defined as follows:

$$ \zeta(x,y) = \begin{cases} 1 & x \leq y \\ 0 & \text{else} \end{cases} $$

Notice that $\zeta(x,x) = 1 \neq 0$ for all $x$, so $\zeta$ must be invertible in $I(P)$! Obviously you can see the punchline coming, but we write

$$ \mu \triangleq \zeta^{-1}. $$
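To make this concrete, here is a sketch computing $\mu = \zeta^{-1}$ on the divisibility poset restricted to $\{1, \ldots, N\}$, using the recursion $\mu(x,x) = 1$ and $\mu(x,y) = -\sum_{x \leq z < y} \mu(x,z)$ that falls out of $\mu \star \zeta = \mathrm{id}$; the cutoff $N = 12$ is arbitrary:

```python
# Sketch: the incidence-algebra mu on {1,...,N} ordered by divisibility.
# mu * zeta = identity forces mu(x,x) = 1 and, for x < y in the poset,
#   mu(x,y) = -sum_{x <= z < y} mu(x,z),  z ranging over [x,y) in the poset.

N = 12
divides = lambda a, b: b % a == 0

mu = [[0] * (N + 1) for _ in range(N + 1)]   # 1-indexed: mu[x][y]
for x in range(1, N + 1):
    mu[x][x] = 1
    for y in range(x + 1, N + 1):
        if divides(x, y):
            mu[x][y] = -sum(mu[x][z] for z in range(x, y)
                            if divides(x, z) and divides(z, y))

# Sanity check: mu really is the matrix inverse of zeta.
for x in range(1, N + 1):
    for y in range(1, N + 1):
        total = sum(mu[x][z] for z in range(x, y + 1)
                    if divides(x, z) and divides(z, y))
        assert total == (1 if x == y else 0)

# The first row recovers the classical Moebius function, mu(n) = mu(1, n):
print([mu[1][n] for n in range(1, N + 1)])
# -> [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]
```

Note that $\mu(x,y)$ only ever depends on the interval $[x,y]$, which in this poset is determined by the quotient $y/x$; that is why a single-variable $\mu(n)$ suffices classically.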

What good is this, though? Well, let's consider a sum

$$ f(n) = \sum_{d \mid n} g(d) $$

We recognize this as a matrix-vector product. If $\vec{g} \triangleq (g(1), g(2), g(3), \ldots)^T$ and $\vec{f} \triangleq (f(1), f(2), f(3), \ldots)^T$, then we find

$$ \vec{f} = \zeta^T \vec{g} $$

(the transpose is just bookkeeping: the divisors $d$ of $n$ sit *below* $n$ in the poset, so it is the first index of $\zeta$ that varies). Indeed, if we look at the $n$th coordinate of $\vec{f}$, we see

$$ f(n) = \sum_{d \leq n} \zeta(d,n) g(d) = \sum_{d \mid n} 1 \cdot g(d) $$

Of course, now Moebius inversion becomes almost tautological!

If $\vec{f} = \zeta^T \vec{g}$, then we must have $\vec{g} = (\zeta^T)^{-1} \vec{f} = \mu^T \vec{f}$.

And indeed, we recover the Moebius inversion formula by looking at the $n$th component of $\vec{g}$:

$$ g(n) = \sum_{d \mid n} \mu(d, n) \cdot f(d) $$

and since the Moebius function of the divisibility lattice depends only on the quotient, $\mu(d,n) = \mu(n/d)$, this is exactly the classical formulation of Moebius Inversion.


The theory of incidence algebras is extremely cool, and is a very powerful technique in algebraic combinatorics. As an exercise, you might be interested in checking for yourself that Moebius inversion in the powerset lattice recovers the principle of inclusion/exclusion! You might start with a small powerset, and manually invert $\zeta$ (or ask a computer) to compute $\mu$, then check that PIE falls out.
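If you'd rather ask a computer, here is a sketch of that exercise on the powerset lattice of a 3-element set, using the same generic recursion for $\mu$; the answer $\mu(A,B) = (-1)^{|B \setminus A|}$ is exactly the sign pattern of inclusion/exclusion:

```python
# Sketch: mu on the powerset lattice of a 3-element set, ordered by
# inclusion, computed with the generic recursion
#   mu(A, A) = 1,   mu(A, B) = -sum_{A <= C < B} mu(A, C).
from itertools import combinations

S = (0, 1, 2)
subsets = [frozenset(c) for r in range(len(S) + 1)
           for c in combinations(S, r)]        # listed in order of size

mu = {}
for A in subsets:
    for B in subsets:                          # smaller sets come first,
        if A <= B:                             # so each mu(A, C) is ready
            if A == B:
                mu[(A, B)] = 1
            else:
                mu[(A, B)] = -sum(mu[(A, C)] for C in subsets
                                  if A <= C < B)

# The signs of inclusion/exclusion fall out: mu(A, B) = (-1)^{|B \ A|}.
for (A, B), v in mu.items():
    assert v == (-1) ** len(B - A)
```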

This is all very effective, and there are explicit formulas which will let you compute, say, the Moebius function $\mu$ of your favorite poset given only the combinatorial data. In fact, it's so effective that computer algebra systems such as Sage will compute these things for you automatically!

If you want to read more about these, the original paper where the theory was developed is quite legible. See *On the foundations of combinatorial theory I: Theory of Möbius functions* by Gian-Carlo Rota. For a textbook account, you can find this treated in Chapter 25 of van Lint and Wilson's *A Course in Combinatorics*, which is honestly a book worth having in its own right.


I hope this helps ^_^