Confusion related to proximal mapping


I was reading this paper http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2012_0388.pdf

and I came across this part

[Image from the paper: a short derivation rewriting the proximal Newton subproblem in terms of the $H_k$-norm.]

I don't see how the third line follows from the second line. Any suggestions?

Best answer

The $H_k$-norm is defined as
$$ \|x\|_{H_k}^2 = x^T H_k x. $$
Hence, we obtain
$$\begin{split} \|(y-x_k) + H_k^{-1} \nabla g(x_k)\|_{H_k}^2 &= \big((y-x_k) + H_k^{-1} \nabla g(x_k)\big)^T H_k \big((y-x_k) + H_k^{-1} \nabla g(x_k)\big)\\ &= (y-x_k)^T H_k (y-x_k) + 2\,(y-x_k)^T \nabla g(x_k) + \nabla g(x_k)^T H_k^{-1} \nabla g(x_k), \end{split}$$
where the cross term simplifies because $H_k H_k^{-1} = I$. The last summand is independent of $y$, so dropping it does not change the minimizer:
\begin{multline*} \arg \min_y\left\{(y-x_k)^T H_k (y-x_k) + 2\,(y-x_k)^T \nabla g(x_k) + \nabla g(x_k)^T H_k^{-1} \nabla g(x_k)\right\}\\ = \arg \min_y\left\{(y-x_k)^T H_k (y-x_k) + 2\,(y-x_k)^T \nabla g(x_k) \right\}. \end{multline*}
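As a quick sanity check, the expansion can also be verified numerically. The sketch below uses random placeholder data: `H` stands in for a positive-definite $H_k$, and `g` for $\nabla g(x_k)$; none of these come from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random symmetric positive-definite H (so the H-norm is well defined)
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)

x_k = rng.standard_normal(n)   # current iterate
y = rng.standard_normal(n)     # trial point
g = rng.standard_normal(n)     # placeholder for the gradient of g at x_k

d = y - x_k
Hinv_g = np.linalg.solve(H, g)           # H^{-1} g, without forming the inverse
v = d + Hinv_g                           # (y - x_k) + H^{-1} g

lhs = v @ H @ v                          # ||v||_H^2
rhs = d @ H @ d + 2 * d @ g + g @ Hinv_g # the three-term expansion

print(np.isclose(lhs, rhs))
```

Since the dropped term `g @ Hinv_g` does not involve `y`, both objectives differ by a constant and share the same minimizer.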