I am reading a paper that goes:
Here tau accounts for timing and sensor inaccuracies (inherent of the operating system available on mobile phones) by providing a decaying velocity model, preventing unwanted drift at small accelerations [...]
My question is:
How can that be a decaying model if the previous velocity is simply added to something? If tau is a constant, I do not see any value in R that can always satisfy what they are saying.
What am I missing?

You are right: the velocity does not necessarily decay. I think "damping model" would be a better name, because it essentially applies a damping factor to the acceleration term, so that the velocity changes less at each step.
Let's say the original model is
$\vec{v}^{k+1} = \vec{v}^{k} + \Delta t R_B(\vec{a}_B^k - g_B)$,
and the "damping model" is
$\vec{v}_I^{k+1} = \vec{v}_I^{k} + \tau \Delta t R_B(\vec{a}_B^k - g_B)$,
then, for the same nonzero acceleration sample and $0 < \tau < 1$, you have \begin{equation} \frac{|\vec{v}_I^{k+1} - \vec{v}_I^{k}|}{|\vec{v}^{k+1} - \vec{v}^{k}|}= \tau < 1. \end{equation}
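
To make the ratio concrete, here is a minimal numerical sketch of the two updates. The values of `dt`, `tau`, `v0`, and `a_world` are made up for illustration (they are not from the paper), and `a_world` simply stands in for the already rotated, gravity-compensated term $R_B(\vec{a}_B^k - g_B)$.

```python
import math

def step(v, a_world, dt, tau=1.0):
    """One velocity update: v + tau * dt * a_world, component-wise.

    tau = 1.0 gives the original model; 0 < tau < 1 gives the damped one.
    """
    return [vi + tau * dt * ai for vi, ai in zip(v, a_world)]

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

# Illustrative values only (assumptions, not taken from the paper).
dt = 0.01
tau = 0.9
v0 = [1.0, 0.0, 0.0]
a_world = [0.2, -0.1, 0.05]  # stands in for R_B (a_B - g_B)

v_plain = step(v0, a_world, dt)        # original model
v_damped = step(v0, a_world, dt, tau)  # "damping model"

# Size of the velocity change in each model, starting from the same v0.
delta_plain = norm([p - q for p, q in zip(v_plain, v0)])
delta_damped = norm([p - q for p, q in zip(v_damped, v0)])
ratio = delta_damped / delta_plain  # equals tau up to rounding

# With zero acceleration the velocity is left untouched: it does not decay.
v_coast = step(v0, [0.0, 0.0, 0.0], dt, tau)

print(ratio, v_coast)
```

The last lines illustrate the point of the question: with zero acceleration the update leaves the velocity unchanged, so `tau` only shrinks each increment and never pulls the existing velocity toward zero.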