I am struggling to understand the derivation of the log-likelihood in the proof of Lemma 1 in Kaufmann et al. (2014) (https://arxiv.org/pdf/1407.4443.pdf).
To give some context: the lemma establishes a lower bound on the sample complexity of multi-armed bandits through a change-of-measure argument. The log-likelihood ratio after $t$ rounds, where at each round $s$ the agent chooses arm $A_s$ and observes reward $Z_s$, is stated as $$L_{t}=L_{t}\left(A_{1}, \ldots, A_{t}, Z_{1}, \ldots, Z_{t}\right):=\sum_{a=1}^{K} \sum_{s=1}^{t} \mathbb{1}_{\left(A_{s}=a\right)} \log \left(\frac{f_{a}\left(Z_{s}\right)}{f_{a}^{\prime}\left(Z_{s}\right)}\right),$$ where $f_a$ and $f_a'$ denote the reward densities of arm $a$ under the two bandit models.
My question concerns how to obtain this from what I know to be the log-likelihood ratio
$$L_{t}=L_{t}\left(A_{1}, \ldots, A_{t}, Z_{1}, \ldots, Z_{t}\right):= \log \left(\frac{p\left(A_1,Z_1,\dots,A_t,Z_t\right)}{p^{\prime}\left(A_1,Z_1,\dots,A_t,Z_t\right)}\right),$$
where $p$ and $p'$ denote the joint densities of the whole history under the two models.
Do they apply some conditioning argument, or a martingale property with respect to the sigma-algebra generated by the observations? It would be very helpful to see the complete derivation. Thanks a lot!
I suspect they use three steps: factor the joint density by the chain rule, cancel the sampling-rule terms (which should be the same under both models, since the algorithm is fixed), and then regroup the per-round terms by arm, but I cannot fill in the details.
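Here is my attempt at writing out those steps; the sampling-rule notation $\pi_s$ is mine, not the paper's, and I am assuming it may depend on the whole past history:

```latex
\begin{align*}
% Step 1: chain-rule factorization of the joint density of the history.
% The density of (A_s, Z_s) given the past is the sampling rule times the
% reward density of the chosen arm:
p(a_1,z_1,\dots,a_t,z_t)
  &= \prod_{s=1}^{t} \pi_s\!\left(a_s \mid a_1,z_1,\dots,a_{s-1},z_{s-1}\right)
     f_{a_s}(z_s), \\
% Step 2: the sampling rule \pi_s is the same under both models (only the
% reward distributions change), so it cancels in the ratio:
\log \frac{p(A_1,Z_1,\dots,A_t,Z_t)}{p'(A_1,Z_1,\dots,A_t,Z_t)}
  &= \sum_{s=1}^{t} \log \frac{f_{A_s}(Z_s)}{f'_{A_s}(Z_s)} \\
% Step 3: partition the rounds according to the arm played,
% using f_{A_s}(Z_s) = \sum_a \mathbb{1}_{(A_s=a)} f_a(Z_s):
  &= \sum_{a=1}^{K} \sum_{s=1}^{t} \mathbb{1}_{(A_s=a)}
     \log \frac{f_a(Z_s)}{f'_a(Z_s)}.
\end{align*}
```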
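As a sanity check on the regrouping step, a small numerical experiment shows that the per-round sum and the per-arm double sum agree; the Gaussian arms, means, and random sampling rule below are a toy choice of mine, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
K, t = 3, 50

# Two hypothetical Gaussian bandit models: arm a has mean mu[a] under the
# first model and mu_alt[a] under the second, both with unit variance.
mu = np.array([0.0, 0.5, 1.0])
mu_alt = np.array([0.2, 0.5, 0.8])

def log_density(z, mean):
    # Log of the N(mean, 1) density evaluated at z.
    return -0.5 * (z - mean) ** 2 - 0.5 * np.log(2 * np.pi)

# Arbitrary sampling rule; rewards drawn from the first model.
A = rng.integers(0, K, size=t)   # arms played
Z = rng.normal(mu[A], 1.0)       # observed rewards

# Per-round form: sum_s log( f_{A_s}(Z_s) / f'_{A_s}(Z_s) ).
L_rounds = sum(log_density(Z[s], mu[A[s]]) - log_density(Z[s], mu_alt[A[s]])
               for s in range(t))

# Per-arm form: sum_a sum_s 1{A_s = a} * log( f_a(Z_s) / f'_a(Z_s) ).
L_arms = sum((A == a) @ (log_density(Z, mu[a]) - log_density(Z, mu_alt[a]))
             for a in range(K))

assert np.isclose(L_rounds, L_arms)
```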