Collapsing a set of rules into a few universal rules - a mathematical Occam's razor


I was thinking about learning the rules of a system just by watching it. Say you know nothing about chess and you watch an infinite number of games - you observe every possible transition between states. If you are dumb (or a neural network), you could write down a really big set of rules, one for each state listing all the states it can evolve into, and call that the rules. It's easy to see you would end up with billions upon billions of rules - which goes very much against Occam's razor.

Let's simplify the problem. Say we have an $x$ by $x$ chessboard with just one rook, and we watch infinitely many moves of this rook. Again, being dumb, we could derive the rules of this system as a collection of SEPARATE rules: from each of the $x^2$ possible positions, the rook can move to $2(x-1)$ other positions. How could we "collapse" this enormous set of rules into something more comprehensible and UNIVERSAL?
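To make the contrast concrete, here is a small sketch (the function names and the choice of a 4-by-4 board are mine, just for illustration): the "dumb" model enumerates every transition as a separate entry, while one short universal rule - the rook moves within its row or column - reproduces the whole table.

```python
def transition_table(x):
    """The 'dumb' model: every (state, next_state) pair stored as a separate rule."""
    table = set()
    for r in range(x):
        for c in range(x):
            for r2 in range(x):
                for c2 in range(x):
                    # A legal rook move: different square, same row or same column.
                    if (r2, c2) != (r, c) and (r2 == r or c2 == c):
                        table.add(((r, c), (r2, c2)))
    return table

def universal_rule(state, next_state):
    """The collapsed model: one rule covering all x^2 * 2(x-1) transitions."""
    (r, c), (r2, c2) = state, next_state
    return (r2, c2) != (r, c) and (r2 == r or c2 == c)

x = 4
table = transition_table(x)
# The single rule reproduces the entire table of separate rules:
assert all(universal_rule(s, t) for (s, t) in table)
assert len(table) == x * x * 2 * (x - 1)  # 16 positions * 6 targets = 96 'rules'
```

The point is that `universal_rule` is a constant-size description, while the explicit table grows like $x^3$.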

I would like to say in advance: I am really sorry for not using a lot of mathematical notation, this problem was a bit more philosophical in my mind and I am not advanced enough in this field of mathematics to write something that makes sense.


A 'mathematical Occam's razor' is the minimum description length principle.

Roughly speaking, it scores a model for a set of observations by the number of bits needed to represent the model, plus the number of bits needed to represent the corrections for the observations the model gets wrong. The model that minimizes this sum is the best model for the observations.

Note that this (just like Occam's razor) does not tell you how to find this model. But it can, given two models, tell you which of the two is better.
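As a toy illustration of such a comparison (the bit counts below are made-up numbers for the rook example, not a rigorous coding scheme), the two-part score can be computed and compared directly:

```python
def description_length(model_bits, errors, bits_per_correction):
    """Two-part MDL code: bits for the model plus bits to correct its mistakes."""
    return model_bits + errors * bits_per_correction

# Hypothetical figures for a 4x4 board:
# - the giant lookup table: 96 transitions at, say, 8 bits each, no errors
table_model = description_length(model_bits=96 * 8, errors=0, bits_per_correction=0)
# - the compact "same row or column" rule: say 200 bits of code, no errors
rule_model = description_length(model_bits=200, errors=0, bits_per_correction=0)

assert rule_model < table_model  # MDL prefers the collapsed universal rule
```

Both models fit the data perfectly here, so the comparison comes down entirely to model size - which is exactly the Occam's razor intuition.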

As for finding these models - well, that's an entire field: machine learning. Tons upon tons of methods exist (linear models, decision trees, (convolutional) neural networks, (non-linear) SVMs, rule-based models, etc.).