How to show that the gradient descent step for unconstrained optimization can be represented as the argmin of a quadratic?


In the FISTA paper, the authors write:

$$x_k = x_{k-1} - t_k \nabla f(x_{k-1}) = \operatorname*{argmin}_x \left\{ f(x_{k-1}) + \langle x - x_{k-1}, \nabla f(x_{k-1})\rangle + \tfrac{1}{2t_k}\|x - x_{k-1}\|_2^2 \right\}$$

I have been stuck on this line for days. I don't have access to the cited text, and I couldn't find any information online about this equivalence. Can someone please provide some guidance on why these two expressions are equivalent?


Best answer

Define $$g(x) = f(x_{k-1}) + \langle x - x_{k-1}, \nabla f(x_{k-1})\rangle + \tfrac{1}{2t_k}\|x-x_{k-1}\|_2^2.$$ Note that $x_{k-1}$ and $t_k$ are constants in this context; the only variable is $x$. This is a strongly convex quadratic function, so it is guaranteed to have a unique minimizer. Since it is differentiable, we can find that minimizer as the point with zero gradient: $$\nabla g(x) = \nabla f(x_{k-1}) + \tfrac{1}{t_k}(x-x_{k-1}) = 0.$$ Isolating $x$, you do indeed get $$x = x_{k-1} - t_k \nabla f(x_{k-1}).$$

In fact, by completing the square you can show that $$g(x) = f(x_{k-1}) + \tfrac{1}{2t_k}\|x-(x_{k-1}-t_k\nabla f(x_{k-1}))\|_2^2 - \tfrac{t_k}{2}\|\nabla f(x_{k-1})\|_2^2,$$ which clearly attains its minimum at the given point, and that minimum value is $$g(x_k) = f(x_{k-1}) - \tfrac{t_k}{2}\|\nabla f(x_{k-1})\|_2^2.$$
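To make the equivalence concrete, here is a minimal numerical sketch (not from the paper; the quadratic `f`, the point `x_prev`, and the step size `t` are arbitrary placeholder choices) that minimizes $g$ directly and checks the result against the explicit gradient step:

```python
import numpy as np
from scipy.optimize import minimize

# Any smooth f would do; we use a convex quadratic for the check.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
A = A @ A.T + np.eye(3)              # positive definite, so f is smooth and convex
b = rng.standard_normal(3)

f = lambda x: 0.5 * x @ A @ x - b @ x
grad_f = lambda x: A @ x - b

x_prev = rng.standard_normal(3)      # plays the role of x_{k-1}
t = 0.1                              # step size t_k

def g(x):
    """The quadratic model: linearization of f at x_prev
    plus the proximity term (1/2t)||x - x_prev||^2."""
    d = x - x_prev
    return f(x_prev) + d @ grad_f(x_prev) + d @ d / (2 * t)

x_argmin = minimize(g, x_prev).x     # argmin of the quadratic model
x_gd = x_prev - t * grad_f(x_prev)   # explicit gradient-descent step

print(np.allclose(x_argmin, x_gd, atol=1e-5))  # True: the two coincide
# Minimum value matches f(x_{k-1}) - (t_k/2)||grad f(x_{k-1})||^2
print(np.isclose(g(x_gd), f(x_prev) - 0.5 * t * grad_f(x_prev) @ grad_f(x_prev)))
```

Both prints come out `True`: the minimizer of $g$ is exactly the gradient step, and the minimum value matches the closed form derived above.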