What is measure theoretical entropy in multidimensional symbolic dynamical systems?

Can anyone describe the term entropy as used in dynamical systems? What is the role of entropy in symbolic dynamical systems? Please also give a brief introduction to measure-theoretic entropy.

Let me start with one-dimensional dynamics, defined by a measurable map $T:X\to X$ with an invariant probability measure $\mu$. The entropy of $T$ is the average amount of information per iteration you could gain by observing a typical trajectory of $T$. To formulate this, consider a point $x\in X$ chosen at random according to $\mu$ (a "typical point") and observe the trajectory of $x$ using a function $f:X\to\Sigma$ (for some alphabet $\Sigma$). Observing the trajectory for $n$ iterations, you would gain a sequence $a_0a_1\cdots a_n$ of $n+1$ random symbols from $\Sigma$ that tells you something (not everything) about where $x$ lies in $X$. You can measure the amount of information you gained from this observation by the entropy $H(a_0a_1\cdots a_n)$. The average information per iteration you gain by observation through $f$ would be $$h(T\,|\,f):= \lim_{n\to\infty}\frac{H(a_0a_1\cdots a_n)}{n+1}\;,$$ where $a_i:=f(T^i x)$. (The limit exists because, by the invariance of $\mu$, the sequence $n\mapsto H(a_0a_1\cdots a_{n-1})$ is subadditive.)

This of course depends on how refined the observation function $f$ is. If $f$ is too detailed, you gain a lot of information in a single observation (the entropy of $a_i$ alone would be large), but that information helps you predict the subsequent observations $a_{i+1},a_{i+2},\ldots$ (hence a smaller conditional entropy for $a_{i+1}$ given $a_i$, and so forth). If $f$ is too coarse, you gain little information from each observation $a_i$, but the different observations become closer to independent. The entropy $$h(T):= \sup_f h(T\,|\,f)$$ is the highest information gain per iteration you could hope for.
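To make this concrete, here is a minimal numerical sketch (my own example, not part of the setup above): the logistic map $T(x)=4x(1-x)$, whose natural invariant measure has entropy $\log 2$, observed through the two-letter function $f(x)=0$ if $x<1/2$, else $1$. We estimate $H(a_0\cdots a_{n-1})/n$ from the empirical block frequencies of a single long orbit, which stand in for the true block distribution (heuristically justified by ergodicity); expect some finite-precision noise.

```python
import math
from collections import Counter

def block_entropy(symbols, n):
    """Shannon entropy (in nats) of the empirical distribution of n-blocks."""
    blocks = [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]
    total = len(blocks)
    return -sum((c / total) * math.log(c / total) for c in Counter(blocks).values())

# Observe the logistic map T(x) = 4x(1-x) through f(x) = 0 if x < 1/2 else 1.
x = 0.1234
symbols = []
for _ in range(200_000):
    symbols.append(0 if x < 0.5 else 1)
    x = 4.0 * x * (1.0 - x)

# H(a_0 ... a_{n-1}) / n should be close to log 2 ≈ 0.693 for each n.
for n in (2, 4, 8, 12):
    print(n, block_entropy(symbols, n) / n)
```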

The entropy of multidimensional dynamics is conceptually the same thing. Say you have two commuting maps $S,T:X\to X$ and a measure $\mu$ that is invariant under both $S$ and $T$. This time, you gather a two-dimensional array $(a_{i,j})$ of observations, where $a_{i,j}:=f(S^i T^j x)$. So the entropy relative to an observation $f:X\to\Sigma$ will be the limit average $$h(S,T\,|\,f):= \lim_{n\to\infty}\frac{H(a_{i,j}: 0\leq i,j\leq n)}{(n+1)^2}\;,$$ over growing squares (or a different family of nice regions, e.g., a Følner sequence).
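Here is a toy sketch of this two-dimensional average (again my own illustration): a large torus of i.i.d. uniform bits stands in for a sample from the uniform Bernoulli measure on $\{0,1\}^{\mathbb{Z}^2}$, where the entropy per site is $\log 2$, and we estimate $H(a_{i,j}: 0\leq i,j< k)/k^2$ from empirical frequencies of $k\times k$ patterns.

```python
import math
import random
from collections import Counter

random.seed(0)

# A stand-in for the two commuting shifts S, T on {0,1}^(Z^2): a_{i,j}
# is simply the bit at position (i, j) of an i.i.d. uniform field.
L = 300
field = [[random.randint(0, 1) for _ in range(L)] for _ in range(L)]

def square(i, j, k):
    """The k-by-k pattern of observations starting at (i, j), with wraparound."""
    return tuple(tuple(field[(i + a) % L][(j + b) % L] for b in range(k))
                 for a in range(k))

# H(a_{i,j} : 0 <= i,j < k) / k^2, estimated from empirical pattern
# frequencies; each value should be close to log 2 ≈ 0.693.
for k in (1, 2, 3):
    counts = Counter(square(i, j, k) for i in range(L) for j in range(L))
    total = L * L
    H = -sum((c / total) * math.log(c / total) for c in counts.values())
    print(k, H / k**2)
```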

For (multidimensional) shift dynamical systems, the entropy becomes simply entropy per symbol. Say $\Sigma$ is a finite alphabet, $\sigma$ is the shift action on $\Sigma^{\mathbb{Z}^d}$, $X\subseteq\Sigma^{\mathbb{Z}^d}$ is a closed shift-invariant subset, and $\mu$ is a shift-invariant probability measure on $X$. Then the entropy of $(X,\sigma,\mu)$ turns out to be the same as the average entropy per symbol $$h(X,\sigma,\mu):=\lim_{n\to\infty}\frac{H(x_{i_1,i_2,\ldots,i_d}: 0\leq i_1,i_2,\ldots,i_d\leq n)}{(n+1)^d}$$ of a random configuration $x$ drawn according to $\mu$. That is, the observation $f(x):=x_0$ (observing the symbol at the origin) is optimal.
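For a concrete one-dimensional instance (my own choice of example): on the golden-mean shift, the binary sequences with no two consecutive $1$s, the Parry measure is the Markov measure written out below, and its entropy per symbol, computed with the standard formula $h=-\sum_i \pi_i \sum_j p_{ij}\log p_{ij}$ for stationary Markov measures, equals $\log\varphi$, the log of the golden ratio.

```python
import math

# Golden-mean shift: binary sequences with no two consecutive 1s.
# Its measure of maximal entropy (the Parry measure) is the Markov
# measure with these transition probabilities.
phi = (1 + math.sqrt(5)) / 2
P = {(0, 0): 1 / phi, (0, 1): 1 / phi**2, (1, 0): 1.0, (1, 1): 0.0}
pi = {0: phi**2 / (1 + phi**2), 1: 1 / (1 + phi**2)}  # stationary distribution

# Entropy per symbol of a stationary Markov measure:
#   h = - sum_i pi_i sum_j p_ij log p_ij
h = -sum(pi[i] * P[i, j] * math.log(P[i, j])
         for i in (0, 1) for j in (0, 1) if P[i, j] > 0)
print(h, math.log(phi))  # both ≈ 0.4812
```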

There are other useful ways to think about the entropy. See e.g. this post.

As for the role of entropy in (symbolic) dynamics, here are two important applications:

First, entropy can be used to distinguish different systems (and this is why it was originally introduced by Kolmogorov and Sinai): if two systems $(X,T,\mu)$ and $(X',T',\mu')$ are isomorphic, they have the same entropy. So entropy can be used to show that two systems are not isomorphic; for example, two shift systems with unequal entropies are not isomorphic. Interestingly, entropy completely classifies the Bernoulli shifts: Ornstein proved that two Bernoulli shifts with the same entropy are isomorphic.
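To see the invariant in action, a quick computation (my own illustration) with the formula $h=-\sum_i p_i\log p_i$ for the entropy of a Bernoulli shift with marginal $(p_1,\ldots,p_k)$: the first two shifts below have different entropies, hence are not isomorphic; the last two have the same entropy and are therefore isomorphic by Ornstein's theorem, despite their different alphabet sizes.

```python
import math

def bernoulli_entropy(p):
    """Entropy per symbol of the Bernoulli shift with marginal distribution p."""
    return -sum(q * math.log(q) for q in p if q > 0)

print(bernoulli_entropy([1/2, 1/2]))                 # log 2: not isomorphic to...
print(bernoulli_entropy([1/4] * 4))                  # ...this one (2 log 2), but
print(bernoulli_entropy([1/2, 1/8, 1/8, 1/8, 1/8]))  # this also gives 2 log 2
```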

A second application of entropy is to pick out, from a family of invariant measures, the ones that are "most random". For example, among all the invariant measures of the full shift $(\Sigma^\mathbb{Z},\sigma)$, the uniform Bernoulli measure (i.e., uniform at each position, independent across positions) has the highest entropy (and is the only one with this property), and this is arguably the most random measure we can have on the full shift. This idea comes from statistical mechanics, where the most random measures (subject to the constraints of the model) are regarded as the macroscopic equilibrium states.
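As a sanity check within the Bernoulli sub-family (a sketch of my own; the full statement ranges over all invariant measures): the entropy $h(p)=-p\log p-(1-p)\log(1-p)$ of the Bernoulli$(p,1-p)$ measure on the full $2$-shift is maximized precisely at the uniform choice $p=1/2$.

```python
import math

def h(p):
    """Entropy per symbol of the Bernoulli(p, 1-p) measure on the full 2-shift."""
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

# A crude grid search: the maximum sits at p = 1/2 with value log 2.
best = max((h(p / 1000), p / 1000) for p in range(1, 1000))
print(best)  # ≈ (0.693, 0.5), i.e., log 2 at the uniform measure
```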