I distinctly remember the news in 2007 about how researchers at Northwestern University had established the upper bound on moves needed to solve a Rubik's Cube at 26. (I haven't been paying attention since, but it's obviously a lot lower now.)
Given: The number of possible states of a proper Rubik's Cube is
$$ 11! \cdot 8! \cdot 3^{8} \cdot 2^{12} = 43,252,003,274,489,856,000 ~~ (4.3252 \times 10^{19}) $$
Using the quarter-turn metric, there are only six (6) possible moves available for any cube at any time (using Singmaster notation):
$$ \text{L, R, U, D, F, and B} $$
Assuming that:
- only the quarter-turn metric is used (e.g. $\text{F}2$ is considered two moves, F F),
- all moves are in one direction only (e.g. $\text{R'}$ would be accomplished by $\text{R R R}$ [or $\text{R}3$]),
- applying a move and subsequently its inverse would be ignored stipulated by (2) above, and also ignoring the same move 4 times (to yield the same original state prior to the 4 moves),
then: a decision tree can be made where each node represents a unique (valid) state. This would be a 6-ary decision tree, with each branch representing each of the 6 basic moves above.
From my CS education, I know that any decision tree with $n$ nodes and $b$ daughter branches per node will have at most $\lceil\textbf{log}_{b} \textbf{n}\rceil$ levels. For states of a Rubik's Cube, this is
$$ \log_{6} (11! \cdot 8! \cdot 3^{8} \cdot 2^{12}) \cong \lceil 24.8473 \rceil = 25 ~\text{levels} $$
What this means is that one can shuffle a Rubik's Cube from any state to any other state (e.g. the solved cube) in not more than 25 moves.
Even though many symmetries and optimizations noted in code20.org show that the max number of moves is lower, my question is regarding the Northwestern Univ. discovery in particular.
Given that
- the Rubik's Cube was invented in 1974 and publicly introduced in 1980,
- decision trees in computer science have been around since at least the 1970s, and
- the Northwestern University research (2007) gave 26 moves as a newly-established upper bound (in agreement with my calculation above),
then my question is:
What took so long?
Is there something lacking in my math above? Or was the complexity of solving the cube not fully understood?
One last side note: I distinctly remember this research announcement in particular as I was teaching a CS lab in grad school at the time, and I discussed this topic with my students. The Northwestern press release seemed to imply that their discovery was a "really big deal", yet the the mathematical analysis seemed obvious to me at the time. Strange...
For the record, I have read the various Rubik's cube questions (and answers) here on math SE (and also CS SE and Stack Overflow), but I couldn't find anything about this topic in particular.
The nodes in your tree you described are not unique. Sure you can fit 43 quintillion nodes in 25 levels, but because your tree allows for duplicates, you can't be sure that you got all unique positions of the cube.
For example consider the node reached by turning RL compared to the node reached by turning LR.
I think that's what you were getting at with your Rule #3: trying to delete obvious duplicates, but in that case each node no longer has 6 children, therefore you no longer can do $\lceil\log_b n\rceil$. For example how many children does the node reached by RRR have?
FWIW you don't need to define a one-direction-only metric to do this (unfortunately flawed) analysis. In regular half-turn metric, each node will have 18 children. So $\lceil\log_{18} n\rceil=\lceil15.6428\rceil=16$, which is way less than what we know is required, due to so many duplicate nodes being in our tree.
(Credit where credit is due: this is basically @Arthur's answer he gave in the comments.)