According to wikipedia and here and here, the lower bound on the probability that an element is $\epsilon$-typical is $1-\epsilon$ for sufficiently large $n$, and this is derived by using the asymptotic equipartition property which states that the limit of this probability as $n$ goes to infinity is $1$. But why $1-\epsilon$? Can't we pick any number less than n as a lower bound? Why not $1-\frac{\epsilon}{2}$ which is higher?
Am I misunderstanding how the AEP is applied? Or is there a practical reason why this lower bound is chosen instead of a better one?
For clarification, the standard result given in all the resources linked above is:
For sufficiently large $n$:
$$\Pr(x^{n}\in A_\epsilon^{(n)})\geq 1-\epsilon$$
But for sufficiently large $n$, every lower bound is true. Why choose the $\epsilon$ in $1-\epsilon$ to be the same as the $\epsilon$ in $A_\epsilon^{(n)}$? I am yet to see a compelling reason for choosing this than a better bound.
Thanks so much.
In mathematical statements such as this $\epsilon>0$ is arbitrary and can be chosen to be any positive value, however small. That's the whole point.
You saying choose $\epsilon/2$ is just redefining the name of the variable since if $\epsilon$ is arbitrarily small and positive so is $\epsilon/2.$
Edit: Incorporating the well written comment by @curlycharcoal in this answer may help the OP:
$\epsilon$ is arbitrary, but fixed. We pick an $\epsilon$ and for that fixed $\epsilon$, we can find a large enough $n$, call it $N(\epsilon)$, such that $\forall n\geq N(\epsilon)$ the $1−\epsilon$ bound holds. You're right, we can still work this for a $1−\frac{\epsilon}{2}$ bound, but that would correspond to $\epsilon/2-$typicality and hence a larger than $N(\epsilon)$ minimum for $n$, say $N(\epsilon/2)>N(\epsilon)$.
Practically, you want only a good enough $\epsilon$ instead of as small as possible because that bump in reliability comes at a cost of $N(\epsilon/2)-N(\epsilon)$ delay in your transmission, which may not be worth it. T
Also, increased blocklength $N(\epsilon)$ also results in a larger complexity decoding algorithm. This complexity is sometimes growing exponentially in $N(\epsilon)$ for probabilistic search based decoding algorithms.