[math-fun] sudoku, k-SAT, dancing links

27 Jul 2005

      Victor wrote
...
Well, It's certainly possible to solve a Sudoku puzzle without trial
and error and guessing.  Here's the proof: you can pose the solution
of Sudoku as a problem in satisfiability (SAT): ............
...
You can solve any solvable SAT problem via resolution (which is like
Gaussian elimination), and back substitution.  This doesn't involve
any guesswork.  However, it could be exponential (since resolving on a
variable can create a large number of new clauses).
Could it be exponential because it is searching and backtracking?  
Creating a large number of
new clauses sounds a bit like searching through a tree.

I wonder is this somehow equivalent to using the 729x324 matrix which 
people have discussed.
It puts sudoku into the framework of a "covering" problem:  Given a 
matrix of 0's and
1's, we want to find some rows that together have exactly one 1 in each 
column.
In other words, their sum is the vector of all 1's.

Here is how to construct the matrix for sudoku:
The rows are indexed by pairs (d,(x,y)), where d is a digit and (x,y) 
are the coordinates of a cell.

The columns are in four groups, each group having 81 columns.
The first group have labels (d,x), where d is a digit and x labels a row 
of the puzzle.
The second group have labels (d,y), where d is a digit and y labels a 
column of the puzzle.
The third group have labels (d,b), where d is a digit and b labels a box 
of the puzzle.
The fourth group have labels (x,y). where (x,y) are the coordinates of a 
cell.

In row (d,(x,y)) we place a 1 in the columns labelled (d,x), (d,y), 
(d,b) where b is the box containing the cell (x,y), and (x,y).  Each row 
has four 1's.

This gives a 729x324 matrix, which I think deserves the name The Sudoku 
Matrix.  (Of course we can do this for n^2 x n^2 sudoku.)

Here's the point:
A valid completed sudoku grid corresponds to a set of 81 rows which 
"cover" in the sense that they collectively have exactly one 1 in each 
column.

A puzzle may be described by a set of rows which extends uniquely to a 
set of 81 rows which cover.
Solving the puzzle is finding the rest of the rows.

(If we drop the 81 columns corresponding to the boxes then we have the 
ordinary Latin squares completion problem.)

One way to efficiently search for a solution in covering problems is by 
a backtracking method of Knuth which he calls "dancing links,"  
mentioned in an earlier post.
See  addendum at  http://spivey.oriel.ox.ac.uk/mike/comp2005/results.html

Solving the puzzle via the matrix puts deductions like "only two numbers 
can go in this cell" and "this number can only go in these two cells" on 
the same footing, and makes any distinction between them artificial.
...
I'm not saying that this is the best way to do it though!  However, if
you wish to do it this way, it is certainly best to first do
resolution on the subset of 1-clauses and 2-clauses (which is very
fast -- there is a linear time algorithm for this).  This corresponds
to making the standard initial deductions.
What would correspond to using trial and error?

I'm impressed that your program solved that puzzle in a fraction of a 
second.
Could it be tweaked to somehow rate the difficulty of a puzzle?

Gary McGuire