From the Kullback-Leibler divergence for matrix factorization: \begin{equation*} \mathbf{X}\approx\mathbf{WH} \tag{1} \end{equation*}
\begin{equation*} d_{\mathrm{KL}} (\mathbf{X}\ \vert\vert \mathbf{WH})=\sum_{f,t}\left(X_{f,t}\log\frac{X_{f,t}}{WH_{f,t}}-X_{f,t}+WH_{f,t}\right) \tag{2} \end{equation*}
\begin{equation*} \stackrel{\hbox{cst.}}{\hbox{=}} \sum_{f,t}-X_{f,t}\log\sum_{k}{W_{f,k}H_{k,t}} + \sum_{f,t}\sum_{k}{W_{f,k}H_{k,t}} \tag{3} \end{equation*}
How is $(3)$ derived from $(2)$, given that the equality only holds up to a constant? Why does the term $X_{f,t}\log{X_{f,t}}-X_{f,t}$ disappear?
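For context, here is a sketch of how the terms in $(2)$ regroup. It assumes that "cst." means equality up to additive terms that do not depend on $\mathbf{W}$ or $\mathbf{H}$ (the quantities being optimized), and that $[\mathbf{WH}]_{f,t}=\sum_{k}W_{f,k}H_{k,t}$; neither assumption is stated explicitly in the excerpt above.
\begin{equation*} d_{\mathrm{KL}} (\mathbf{X}\ \vert\vert \mathbf{WH})=\underbrace{\sum_{f,t}\left(X_{f,t}\log X_{f,t}-X_{f,t}\right)}_{\hbox{no dependence on }\mathbf{W},\mathbf{H}}-\sum_{f,t}X_{f,t}\log[\mathbf{WH}]_{f,t}+\sum_{f,t}[\mathbf{WH}]_{f,t} \end{equation*}
Under that reading, the first sum is fixed once the data $\mathbf{X}$ is given, so dropping it shifts the objective by a constant without changing the minimizing $\mathbf{W}$ and $\mathbf{H}$.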
Reference (page 77)