I have a collection of $M$ codewords $x = [x_1, x_2,\dots]$, one of which is chosen randomly and is passed through a discrete memoryless channel $W_{Y|X}$. The special thing about $W$ is that for some channel input/output pairs, we have $P_{X|Y}(X=x|Y_i=y_i) = 0$.
This particular decoder does one thing: it keeps a list of the $M$ codewords and as soon as it observes a channel output for which $P_{X|Y}(X=x|Y_i=y_i) = 0$ for some $x$, it removes the codeword from the list of candidates. This procedure continues until $M-1$ codewords have been eliminated.
I am interested in the stopping time associated with the event that $M-1$ codewords have been eliminated, call it $\tau$. For each choice of $M$, there is a corresponding $\mathbb E[\tau] = \ell$ and for each $\ell$ there is a maximum number $M$ of codewords that can be reliably discerned.
So given that I know $\mathbb E[\tau]$ (but no additional moments), I want to know what additional conditions I need to satisfy in order to establish concentration of measure for $\tau$? Ie I want to show that $P(\tau > \mathbb E[\tau]+\delta) \rightarrow 0 $ as $\ell \rightarrow \infty$.