I am studying information theory on "Elements of Invormation theory" (Cover Thomas). I cannot understand the meaning of "to first order in exponent" in the following theorem: .................................................................
Theorem 11.1.4 (Probability of type class) for any $P ∈ \mathcal{P_n}$ n and any distribution $Q$, the probability of the type class $T(P)$ under $Q^n$ is $2^{−nD(P ||Q)}$ to first order in the exponent. More precisely, $$ \frac{1}{(n+1)^{\chi}}2^{−nD(P ||Q)}\leq Q^n(T(P))\leq2^{−nD(P ||Q)} $$ .................................................................
At first I tried to interpret "to first order in the exponent" as $$ Q^n(T(P))\in[2^{−nD(P ||Q)}-nD(P ||Q),2^{−nD(P ||Q)}+nD(P ||Q)] $$ but then I cannot get the inequality with those 2 bounds into the theorem. Any help?
I was reading this same section and was puzzled by this as well but was able to find a resource:
As found in Definition 3 in these lecture notes,
As an example, take $a_n = f(n) e^{g(n)}$ and $b_n = h(n) e^{l(n)}$. Then, $$\lim_{n\to\infty} \frac{1}{n}\log\frac{a_n}{b_n} =\lim_{n\to\infty}\left( \frac{1}{n}\log\frac{f(n)}{h(n)} + \frac{1}{n}(g(n) - l(n))\right)$$ If $f$ and $h$ grow more slowly than exponentially the first term will vanish. For the second to vanish, taking $g$ and $p$ to be polynomial in $n$ or considering their expansions, they must agree to all powers of $n$ above and including the first power.
Essentially this is just saying that the linear terms in the exponent agree. As this is concerning the dominant scaling it also requires that if higher order ($n^2$ etc.) terms exist, they also agree but terms like $n^{-1}$ are not required to agree.
In the case of the bounds given above, as $Q^n(T(P))$ is bounded by an exponential and the same exponential multiplied by a term polynomial in $n$, the bounds agree to first order in the exponent and we can thus say $$Q^n(T(P)) \doteq 2^{-nD(P||Q)}$$