Consider a Hamming graph $H(d, q)$. What is the rank of its incidence matrix?
Hypergraph
A Hamming graph can be seen in two ways:
- Take the numbers $0$ until $q^d - 1$ and represent them in base $q$. Each number is a vertex. Vertices are connected if exactly one base-$q$ digit is different.
- Take $d$ complete graphs, each with $q$ vertices. The Hamming graph is the Cartesian product of these graphs.
I consider the Hamming graph to be a hypergraph, with each hyperedge connecting the $d$ vertices that differ in one digit given their base-$d$ representation. As a result, the incidence matrix has $q$ 1s per column. For example, for $H(2, 3)$, the incidence matrix would be $$ \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}. $$
q = 2
I found a proof (see Lemma 3) that the rank is $2^d - 1$ if $q = 2$. Intuitively, this makes sense to me: If you transpose the incidence matrix you get the $A$ in the system of linear equations $Ax = y$ for some $y$. If you know just a single $x_i$ you can "chain" the solution to all other equations. This means there is only a single free variable, so the rank is $2^d - 1$.
q > 2
But what happens when you change $q$? I've constructed the general form of the incidence graph $G_{d, q}$ of $H(d, q)$: $$ \begin{bmatrix} G_{d - 1, q} & 0 & \dots & 0 \\ 0 & G_{d - 1, q} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & G_{d - 1, q} \\ I_{q^{d - 1}} & I_{q^{d - 1}} & \dots & I_{q^{d - 1}} \end{bmatrix}. $$ Containing $q$ copies of $G_{d - 1, q}$, this is a $d q^{d - 1} \times q^d$ matrix
I wrote some code which varies the parameters $d$ and $q$, and extrapolating from those results I found the general formula $q^d - (q - 1)^d$.
Question
I want to use this formula in my thesis, and while I have an intuition as to why this formula might be correct, I find it difficult to give a proof of its correctness. Despite my best efforts, I couldn't find any literature that confirm or contradict my results.
Another paper by W. Fish gives a general formula for the rank of the incidence matrix, but she uses regular edges instead of hyperedges, resulting in a different formula, $q^d - 1$, but this doesn't seem to be correct for my case based on the results of my code.
Could anyone point me to existing literature about this problem, or does someone know how to prove that my formula is correct?
I was able to solve this problem in my master thesis, Privacy-Preserving Data Aggregation with Probabilistic Range Validation. It's in Theorem 6.4 in Section 6.2.2.
A few notes:
My proof uses complete induction, similar to the proof by W. Fish in the paper I linked in my question. The base case is a 1-dimensional hypermesh, which is trivial. The induction case involves a sequence of column operations such that the matrix is then in a column echelon form. I then count the number of empty columns to calculate the rank.