How to derive the M-step of this EM algorithm?


I have been reading a paper: Wang X, Golbandi N, Bendersky M, Metzler D, Najork M. Position bias estimation for unbiased learning to rank in personal search. In Proceedings of WSDM 2018.

I am wondering how they derive the M-step in an EM algorithm that has been applied to the problem in their paper.

This is the data log-likelihood function: $$ \log P(\mathcal{L})=\sum_{(c, q, d, k) \in \mathcal{L}} c \log \left(\theta_{k} \gamma_{q, d}\right)+(1-c) \log \left(1-\theta_{k} \gamma_{q, d}\right), $$ where the observed data are $(c, q, d, k)$ records ($c$: click, $q$: query, $d$: document, $k$: result position); $\theta_{k}$ is the probability that the search result at rank $k$ is examined by the user (assumed to be independent of the query-document pair), and $\gamma_{q, d}$ is the probability that the query-document pair is actually relevant (assumed to be independent of result position).

Here are the posterior probabilities of the hidden variables ($E$ means examined and $R$ means relevant; the authors assume that a click implies the search result was both examined and relevant): $$ \begin{array}{l} P(E=1, R=1 \mid C=1, q, d, k)=1 \\ P(E=1, R=0 \mid C=0, q, d, k)=\frac{\theta_{k}^{(t)}\left(1-\gamma_{q, d}^{(t)}\right)}{1-\theta_{k}^{(t)} \gamma_{q, d}^{(t)}} \\ P(E=0, R=1 \mid C=0, q, d, k)=\frac{\left(1-\theta_{k}^{(t)}\right) \gamma_{q, d}^{(t)}}{1-\theta_{k}^{(t)} \gamma_{q, d}^{(t)}} \\ P(E=0, R=0 \mid C=0, q, d, k)=\frac{\left(1-\theta_{k}^{(t)}\right)\left(1-\gamma_{q, d}^{(t)}\right)}{1-\theta_{k}^{(t)} \gamma_{q, d}^{(t)}} \end{array} $$
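As a quick sanity check of these expressions, the four hidden-variable cases given $C=0$ must form a probability distribution (the case $E=1, R=1$ has probability zero given $C=0$, since a click is implied by examination together with relevance). A minimal sketch, with illustrative values for $\theta_k$ and $\gamma_{q,d}$ that are not taken from the paper:

```python
# Sanity check: given C=0, the posteriors over (E, R) must sum to 1.
# theta and gamma are illustrative values, not taken from the paper.
theta = 0.7  # theta_k: probability that rank k is examined
gamma = 0.3  # gamma_{q,d}: probability that the pair (q, d) is relevant

denom = 1.0 - theta * gamma  # P(C=0) = 1 - theta_k * gamma_{q,d}
p_e1_r1 = 0.0  # impossible given C=0: a click is implied by E=1 and R=1
p_e1_r0 = theta * (1.0 - gamma) / denom
p_e0_r1 = (1.0 - theta) * gamma / denom
p_e0_r0 = (1.0 - theta) * (1.0 - gamma) / denom

total = p_e1_r1 + p_e1_r0 + p_e0_r1 + p_e0_r0
print(round(total, 10))  # -> 1.0
```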

The paper has derived the M-step update formulas: $$ \begin{array}{l} \theta_{k}^{(t+1)}=\frac{\sum_{c, q, d, k^{\prime}} \mathbb{I}_{k^{\prime}=k} \cdot(c+(1-c) P(E=1 \mid c, q, d, k))}{\sum_{c, q, d, k^{\prime}} \mathbb{I}_{k^{\prime}=k}} \\ \gamma_{q, d}^{(t+1)}=\frac{\sum_{c, q^{\prime}, d^{\prime}, k} \mathbb{I}_{q^{\prime}=q, d^{\prime}=d} \cdot(c+(1-c) P(R=1 \mid c, q, d, k))}{\sum_{c, q^{\prime}, d^{\prime}, k} \mathbb{I}_{q^{\prime}=q, d^{\prime}=d}} \end{array} $$ However, how does one obtain these two formulas?
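To see the E-step and M-step working together, here is a minimal self-contained sketch of the whole EM loop on a synthetic click log. All names and ground-truth values (the ranks, the query `"q"`, the documents `d1`/`d2`) are invented for illustration; the parameter updates are the two formulas quoted above:

```python
import random
from collections import defaultdict

random.seed(0)

# Ground-truth parameters for a toy click log (illustrative values only):
true_theta = {1: 0.9, 2: 0.5, 3: 0.2}              # examination prob per rank
true_gamma = {("q", "d1"): 0.8, ("q", "d2"): 0.3}  # relevance per (query, doc)

# Simulate the log L: each record is (c, q, d, k), with P(C=1) = theta_k * gamma_{q,d}.
log = []
for _ in range(20000):
    q, d = random.choice(list(true_gamma))
    k = random.choice(list(true_theta))
    c = int(random.random() < true_theta[k] * true_gamma[(q, d)])
    log.append((c, q, d, k))

# Initialize parameters and alternate E- and M-steps.
theta = {k: 0.5 for k in true_theta}
gamma = {qd: 0.5 for qd in true_gamma}
for _ in range(50):
    num_t, den_t = defaultdict(float), defaultdict(float)
    num_g, den_g = defaultdict(float), defaultdict(float)
    for c, q, d, k in log:
        t, g = theta[k], gamma[(q, d)]
        # E-step posteriors given no click (they vanish when c = 1 below):
        p_e1 = t * (1.0 - g) / (1.0 - t * g)  # P(E=1 | C=0) = P(E=1, R=0 | C=0)
        p_r1 = (1.0 - t) * g / (1.0 - t * g)  # P(R=1 | C=0) = P(E=0, R=1 | C=0)
        # M-step accumulators, following the paper's two update formulas:
        num_t[k] += c + (1 - c) * p_e1
        den_t[k] += 1.0
        num_g[(q, d)] += c + (1 - c) * p_r1
        den_g[(q, d)] += 1.0
    theta = {k: num_t[k] / den_t[k] for k in theta}
    gamma = {qd: num_g[qd] / den_g[qd] for qd in gamma}

print({k: round(v, 2) for k, v in theta.items()})
print({qd: round(v, 2) for qd, v in gamma.items()})
```

Note that only the products $\theta_k \gamma_{q,d}$ are identified: scaling all $\theta_k$ by a constant and all $\gamma_{q,d}$ by its inverse leaves the likelihood unchanged, so the recovered values may differ from the ground truth by a common factor while preserving their ordering.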

Note: the M-step is usually derived from this formula: $$ {\displaystyle Q({\boldsymbol {\theta }}\mid {\boldsymbol {\theta }}^{(t)})=\operatorname {E} _{\mathbf {Z} \mid \mathbf {X} ,{\boldsymbol {\theta }}^{(t)}}\left[\log L({\boldsymbol {\theta }};\mathbf {X} ,\mathbf {Z} )\right]\,} $$

$$ {\displaystyle {\boldsymbol {\theta }}^{(t+1)}={\underset {\boldsymbol {\theta }}{\operatorname {arg\,max} }}\ Q({\boldsymbol {\theta }}\mid {\boldsymbol {\theta }}^{(t)})\,} $$ where $\boldsymbol \theta$ are parameters (i.e., $\theta_{k}$ and $\gamma_{q, d}$ in this case), $\mathbf {X}$ are data (c,q,d,k in this case), and $\mathbf {Z}$ are hidden variables ($E$ and $R$?).

1 Answer

I have found how to reach those update formulas, although my final form looks a bit different. I am happy to share it here in case anyone can figure out why.

First, write the $Q$ function with the hidden-variable cases separated (note that $P(E=1, R=1 \mid C=0, q, d, k) = 0$, so the clicked and non-clicked cases do not overlap): $$ Q(\theta_k, \gamma_{q,d} \mid \theta_k^{(t)}, \gamma_{q,d}^{(t)}) = \sum_{(c,q,d,k) \in \mathcal{L}} \bigg[ P(E=1, R=1 \mid C=1, q, d, k) \cdot c \cdot \log(\theta_k \gamma_{q,d}) + P(E=1, R=0 \mid C=0, q, d, k) \cdot (1 - c) \cdot \log(\theta_k (1 - \gamma_{q,d})) + P(E=0, R=1 \mid C=0, q, d, k) \cdot (1 - c) \cdot \log((1 - \theta_k) \gamma_{q,d}) + P(E=0, R=0 \mid C=0, q, d, k) \cdot (1 - c) \cdot \log((1 - \theta_k) (1 - \gamma_{q,d})) \bigg] $$

Then take the derivatives and set them to zero: $$ 0 = \dfrac{\partial}{\partial \theta_k} Q(\theta_k, \gamma_{q,d} \mid \theta_k^{(t)}, \gamma_{q,d}^{(t)}), \qquad 0 = \dfrac{\partial}{\partial \gamma_{q,d}} Q(\theta_k, \gamma_{q,d} \mid \theta_k^{(t)}, \gamma_{q,d}^{(t)}). $$
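The derivative step can be made explicit; here is a sketch for $\theta_k$ (the $\gamma_{q,d}$ case is symmetric). Only records with rank $k' = k$ involve $\theta_k$, and differentiating their terms gives $$ \frac{\partial Q}{\partial \theta_{k}} = \sum_{(c, q, d, k') \in \mathcal{L}} \mathbb{I}_{k'=k} \left[ \frac{c + (1-c)\, P(E=1, R=0 \mid C=0, q, d, k)}{\theta_{k}} - \frac{(1-c)\left( P(E=0, R=1 \mid C=0, q, d, k) + P(E=0, R=0 \mid C=0, q, d, k) \right)}{1 - \theta_{k}} \right]. $$ Because for every record the weights sum to one, $$ c + (1-c)\left( P(E=1, R=0 \mid \cdot) + P(E=0, R=1 \mid \cdot) + P(E=0, R=0 \mid \cdot) \right) = 1, $$ setting the derivative to zero and solving for $\theta_k$ yields the ratio form of the paper's update.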

Setting the derivative to zero and collecting the terms for a fixed rank $k$ gives $$ \theta_k^{(t+1)} = \frac{\sum_{(c,q,d,k') \in \mathcal{L}} \mathbb{I}_{k'=k} \left( c + (1-c)\, P(E=1, R=0 \mid C=0, q, d, k) \right)}{\sum_{(c,q,d,k') \in \mathcal{L}} \mathbb{I}_{k'=k}}. $$ This matches the paper's formula: $P(E=1 \mid C=0, q, d, k) = P(E=1, R=0 \mid C=0, q, d, k)$, because the case $E=1, R=1$ is impossible given $C=0$ (a click is implied by examination together with relevance), so the two expressions are the same quantity.