What does an optimal Turing Machine mean?


Let $M$ be a TM, and let $x \in \Sigma^*$. The plain Kolmogorov complexity of $x$ with respect to $M$ is

$C_{M}(x) = \min\{|\pi| : \pi \in \Sigma^* \land M(\pi) = x\}.$

A TM $U$ is optimal if, for every TM $M$, there is a constant $c_M \in \mathbb{N}$ such that for all strings $x \in \Sigma^*$,

$C_{U}(x) \leq C_{M}(x) + c_M $

I understand plain Kolmogorov complexity as the length of the shortest program $\pi$ on which $M$ prints $x$. However, I am stumped by the second definition, of an optimal TM.

Best answer:

Intuitively, it means that even if some machine $M$ can produce a few particular strings more efficiently than $U$ can, no machine can do systematically better than $U$: the gap is bounded by a constant that depends only on $M$, not on $x$.

It's probably helpful to think of some Turing machines that are not optimal according to this definition. For instance, let $M$ be the Turing machine that, regardless of input, returns $1001001101101$. Then we have $$C_M(x) = \begin{cases}0,& x = 1001001101101\\\infty,& \text{otherwise.}\end{cases}$$
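As a sanity check, this complexity function can be computed by brute force in a toy Python model (an illustration only: "machines" are just functions from binary strings to strings, and a search bound stands in for $\infty$):

```python
from itertools import product

def plain_complexity(machine, x, max_len=12):
    """Brute-force C_M(x): search for the shortest binary program pi
    with machine(pi) == x. Returns None if no program of length
    <= max_len works (standing in for C_M(x) = infinity)."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            pi = "".join(bits)
            if machine(pi) == x:
                return n  # length of the shortest program found
    return None

# The constant machine from the example: it ignores its input entirely.
def constant_machine(pi):
    return "1001001101101"

# C_M of its one output is 0 (the empty program already prints it) ...
print(plain_complexity(constant_machine, "1001001101101"))  # 0
# ... and C_M of every other string is infinite (None here).
print(plain_complexity(constant_machine, "0"))  # None
```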

On the other hand, if $U$ is a universal TM, then we probably have $C_U(1001001101101)>0,$ but if we take $c_M=C_U(1001001101101),$ then we have $C_U(x) \le C_M(x)+c_M$ (for every other $x$, $C_M(x)=\infty$, so the inequality holds trivially), and $U$ satisfies the optimality condition relative to $M$. This example illustrates why we need some wiggle room in the optimality condition: we can't expect a TM to be more efficient on every possible string.

For a more interesting example that goes deeper into the concept, let $M$ be a TM that implements the identity function. Then we have $C_M(x) = |x|.$ If $U$ is a universal TM, then for any string $x$ we can let $\pi(x)$ be the code (for $U$) of a program that "hard-codes" $x$ and returns it, in some sensible, fixed way. Under any reasonable coding, there's a constant $c$ such that $|\pi(x)|\le |x|+c$ for all strings $x.$ Thus we have $C_U(x) \le |x|+c=C_M(x)+c.$
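The constant-overhead claim can be made concrete with a toy encoding (the one-bit header and `toy_U` below are assumptions for illustration, not a real universal TM):

```python
def toy_U(pi):
    """A toy stand-in for U: a one-bit header picks a mode.
    '0' + x  ->  x   (the program that hard-codes x and returns it)
    Everything else "diverges" (None). Not a real universal TM."""
    if pi.startswith("0"):
        return pi[1:]
    return None

def hardcode_program(x):
    # pi(x) from the answer: header '0' followed by x itself
    return "0" + x

x = "1011"
assert toy_U(hardcode_program(x)) == x          # the program prints x
assert len(hardcode_program(x)) == len(x) + 1   # so C_U(x) <= |x| + c, with c = 1
```

Here the "reasonable coding" costs exactly one extra bit; a real universal machine would pay a larger, but still fixed, constant.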

But it's not the case that there is a constant $d$ such that $C_M(x) \le C_U(x)+d$, since we can often do much better than just hard-coding $x.$ For instance, if $x_N$ is the string of $N$ zeros for some very large $N$, one could imagine writing a Python program on the order of $\log N$ characters long that returns it (e.g. `return 1000000 * "0"` returns a string of $1000000$ zeros), and that translates to $C_U(x_N)\le \log(N)+k$ (where $k$ is just a stand-in for however many extra bits of overhead we happen to need to encode the "append zero _ times" aspect).

Thus for any constant $d,$ we can choose $N$ large enough that $N-\log(N)> d+k,$ and then we have $C_M(x_N)> C_U(x_N)+d.$ So $M$ fails the optimality criterion relative to $U$, whereas we already showed that $U$ passes the optimality criterion relative to $M$.
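The $\log N$ saving can likewise be sketched by giving a toy machine a second mode that decodes a binary number $N$ and prints $N$ zeros (again an illustrative encoding, not a real universal TM):

```python
def toy_U2(pi):
    """Toy machine with a one-bit header (illustrative encoding only):
    '0' + x  ->  x                 (hard-code x)
    '1' + b  ->  '0' * int(b, 2)   (read b as a binary number N; print N zeros)
    """
    if pi.startswith("0"):
        return pi[1:]
    if pi.startswith("1") and len(pi) > 1:
        return "0" * int(pi[1:], 2)
    return None

N = 1_000_000
pi = "1" + bin(N)[2:]                  # about log2(N) + 2 bits in total
assert toy_U2(pi) == "0" * N           # the short program prints x_N
assert len(pi) == N.bit_length() + 1   # 21 bits here, versus |x_N| = 1,000,000
```

The identity machine must spend all $N$ bits on $x_N$, while this machine spends about $\log_2 N$, which is exactly the gap that grows past any constant $d$.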