Application of (Solomonoff) Algorithmic Probability formula?


Ray Solomonoff gives the Algorithmic Probability formula as,

$$ P_M(x)=\sum_{i=1}^{\infty}2^{-|s_{i}(x)|} \tag{1} $$

If I understand the formula correctly, $M$ is a Turing machine that outputs strings of symbols $x(n)$ of length $n$. Furthermore, each $s_i$ is a description of $x(n)$, in the sense that the symbols output by $M(s_i)$ would be the same as $x(n)$.

Thus, when computing Eq. (1), is $P_M(x)$ the probability of observing the (finite) string $x$ as the output of Turing machine $M$, given the descriptions $s_i$? Am I on the right track?

Does this mean that the Algorithmic Probability formula, given in Eq. (1), can be used to forecast any data (e.g., weather, population growth, etc.)? If so, a simple mathematical example of it being used would be super appreciated!


Best answer

Let $M$ indicate a prefix-free Universal Turing Machine, which can output strings using symbols in set $A$. Let $M(s)$ indicate the output of this machine on input program $s \in \{0,1\}^*$. Then, the algorithmic probability of a finite string $x \in A^*$ (also sometimes known as the universal prior probability of $x$) is given by \begin{align} P_M(x) = \sum_{s: M(s) = x} 2^{-|s|} \,, \end{align}
where $|s|$ indicates the length of program $s$ in bits.
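
To see how this sum behaves, here is a minimal Python sketch. It assumes a hypothetical toy machine given as a lookup table from binary programs to outputs, so it is not a universal machine (for a genuine universal machine, $P_M$ is incomputable); it only illustrates how $\sum_{s : M(s) = x} 2^{-|s|}$ is evaluated once the program-to-output map is known.

```python
# Minimal sketch: a hypothetical prefix-free "machine" given as a lookup
# table from binary programs to output strings. NOT a universal Turing
# machine; it only shows how P_M(x) = sum over {s : M(s) = x} of 2^{-|s|}
# is evaluated once the program-to-output map is known.

# Prefix-free program set: no program is a prefix of another
# (the lengths satisfy Kraft's inequality: 1/2 + 1/4 + 1/8 + 1/8 = 1).
TOY_MACHINE = {
    "0":   "ab",  # 1-bit program printing "ab"
    "10":  "ab",  # 2-bit program also printing "ab"
    "110": "ba",  # 3-bit program printing "ba"
    "111": "aa",  # 3-bit program printing "aa"
}

def algorithmic_probability(x: str) -> float:
    """Sum 2^{-|s|} over all programs s with M(s) = x."""
    return sum(2.0 ** -len(s) for s, out in TOY_MACHINE.items() if out == x)

print(algorithmic_probability("ab"))  # 2^-1 + 2^-2 = 0.75
print(algorithmic_probability("ba"))  # 2^-3        = 0.125
print(algorithmic_probability("aa"))  # 2^-3        = 0.125
```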

$P_M(x)$ is the probability that machine $M$ would produce $x$ and halt if it were fed random input bits (e.g., generated by flipping a fair coin for each bit). This probability is a special kind of "prior distribution", which assigns larger probabilities to strings $x$ that can be generated by short programs. In fact, $P_M(x) = 2^{-K(x)}$ up to a multiplicative constant (which does not depend on $x$), where $K(x)$ is the prefix Kolmogorov complexity of $x$.
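
To make the coin-flipping interpretation concrete, here is a sketch along the same lines (same hypothetical toy machine as above): feed the machine one random bit at a time and halt as soon as the bits read so far form a program. Because the program set is prefix-free, the match is unambiguous, and the empirical output frequencies approximate $P_M$.

```python
import random

# Same hypothetical toy machine as in the previous sketch.
TOY_MACHINE = {"0": "ab", "10": "ab", "110": "ba", "111": "aa"}
MAX_LEN = max(len(s) for s in TOY_MACHINE)

def run_on_random_bits(rng: random.Random) -> str:
    """Flip a fair coin per input bit; halt as soon as the bits read so far
    form a program (prefix-freeness makes the match unambiguous)."""
    bits = ""
    while len(bits) < MAX_LEN:
        bits += rng.choice("01")
        if bits in TOY_MACHINE:
            return TOY_MACHINE[bits]
    return ""  # never reached here: this program set covers every bit stream

rng = random.Random(0)
n_trials = 100_000
counts = {}
for _ in range(n_trials):
    out = run_on_random_bits(rng)
    counts[out] = counts.get(out, 0) + 1

for x, c in sorted(counts.items()):
    print(x, c / n_trials)  # ~0.75 for "ab", ~0.125 each for "aa" and "ba"
```

Running this reproduces (up to sampling noise) the exact values computed above, illustrating that $P_M(x)$ really is the probability of halting with output $x$ under random input.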

Solomonoff induction is a way of using the distribution $P_M$ to make predictions, and it comes with strong theoretical guarantees (its cumulative prediction error is bounded for any computable data source). Imagine that you observed some binary string $x = \langle x_1, x_2, \dots , x_{n-1}\rangle$ and you wish to predict whether the next bit will be $x_n = 0$ or $x_n = 1$. Solomonoff induction suggests choosing the continuation that makes the entire sequence from $x_1$ to $x_n$ have the highest algorithmic probability: \begin{align} x_n = \operatorname{argmax}_{x_n' \in \{0,1\}} P_M(\langle x_1, x_2, \dots , x_{n-1}, x_n' \rangle) \end{align} More generally, one can make predictions by applying Bayes' rule to the prior distribution $P_M$ (usually after normalizing it appropriately; see Solomonoff's papers on the subject).
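
Finally, regarding the question about forecasting: in principle yes, but $P_M$ is incomputable, so any practical use requires approximations. As a hedged illustration of the argmax rule only, here is a sketch with another hypothetical toy machine whose outputs are binary strings; it predicts the next bit of an observed prefix by comparing the prior weight of the two possible continuations.

```python
# Hedged sketch of the argmax prediction rule, with another hypothetical
# prefix-free toy machine whose outputs are binary strings. Real Solomonoff
# induction uses a universal machine and is incomputable; this only
# illustrates the comparison between the two candidate continuations.

TOY_MACHINE = {
    "0":    "0101",  # the shortest program prints the "regular" pattern
    "10":   "0100",
    "110":  "0101",  # a second, longer program for the same string
    "1110": "0110",
    "1111": "0111",
}

def P(x: str) -> float:
    """P_M(x): sum of 2^{-|s|} over programs s that output x."""
    return sum(2.0 ** -len(s) for s, out in TOY_MACHINE.items() if out == x)

observed = "010"  # x_1 x_2 x_3; we want to predict x_4
candidates = {b: P(observed + b) for b in "01"}
prediction = max(candidates, key=candidates.get)

print(candidates)  # {'0': 0.25, '1': 0.625}
print(prediction)  # '1': the continuation "0101" carries more prior weight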