Is there an underlying graph-theory representation of Sudoku solutions?

136 Views Asked by At

I have been puzzling for some time about how a completed 9x9 Sudoku solution can be represented mathematically, and how that mathematical representation can be used to enumerate the different mathematically unique arrangements of the cells.

I am only a high-school level mathematician, so my understanding of the subject area and its possible solutions may be a stretch, but nevertheless I would be interested in whether anyone has done work on this subject.

I understand that a sudoku board can be considered as just a 2-d representation, for convenience, of an underlying graph, or maybe hypergraph, which more fundamentally represents the true structure, i.e. the relationships between the cells, of the board. However, a completed Sudoku board can be rearranged by swapping rows within each 3-row group, swapping columns within each 3-column group, swapping complete 3-row or 3-column groups around, rotating and mirroring the board, and even switching the numbers around, for example changing all the 6 cells to 3 and 3 cells to 6. What I am looking for is a representation of the underlying canonical "shape" of the board which is the same regardless of the transformations. All the transformations are simply presentational changes in the way the underlying unique shape is projected onto a 2-d surface.

The problems I am struggling with are:

a) How can I determine the canonical underlying mathematical definition, i.e. the "shape", of any given completed Sudoku board?

b) How many unique Sudoku "shapes" are there?

It seems that most articles or papers that discuss Sudoku look at the maths of the puzzle process, i.e. setting and solving, rather than the maths of the underlying solution. In this case, puzzles that actually represent the same underlying shape (i.e. one puzzle can be rearranged, or "transformed", to the other) are considered as distinct puzzles - not as two instances of the same puzzle.

In respect of problem a), I have made some attempts at defining the rules for transforming a completed board into its underlying canonical mathematical form. I have tried developing an algorithm that classifies the relative positioning of cells of the same number and different numbers across the board (ignoring the number itself) so that there is a representation of the "shape" (as a graph/network) that does not change even if the 2-d board is rearranged, e.g. by swapping rows and columns or rotating/flipping the board. I have not succeeded (yet).

I guess combinatorics will come into the answer to b), but I have not even attempted that. What interests me here is that maybe there are relatively few underlying unique shapes and, if so, perhaps it would be fun to apply the process identified in a) to thousands or millions of completed grids to see how many of their underlying canonical shapes can actually be found. Also, maybe there are classes of different types or families of shape that have different mathematical or topological properties.

If anyone has any suggestions or can point me in the direction of any work already done in this area I would be very grateful.

1

There are 1 best solutions below

1
On

One approach is to represent it as a $9$-uniform hypergraph: there are $81$ vertices, and every row, column, and $3\times 3$ box is a hyperedge. This representation tells us what all the constraints are without having to specify the grid layout.

Actually, the problem of solving the Sudoku also has a graph-theoretic formulation. A completed Sudoku solution is a proper coloring of the hypergraph. A Sudoku puzzle is a partial coloring that we want to finish.

Probably the most famous hypergraph coloring problem is the Erdős–Faber–Lovász conjecture. It says that if the Sudoku hypergraph were linear, then we would need at most $9$ colors to color it. (Well, it says the generalized version of that.) Unfortunately, the Sudoku hypergraph is not linear: a hyperedge corresponding to a $3 \times 3$ box can overlap in multiple vertices with a hyperedge corresponding to a row or column. However, the conclusion of the Erdős–Faber–Lovász conjecture still holds: we can fill in a Sudoku with only $9$ distinct symbols! (Of course, this doesn't imply that every partial solution can be completed.)