Gradient descent with linear perturbation


Given a convex, differentiable function $f$ (from a Hilbert space to $\mathbb{R}$) with a minimizer (say $x^*$), I know you can find $x^*$ using gradient descent. Suppose now that you apply gradient descent to a linear perturbation of $f$: \begin{equation} \hat{f} : x \mapsto f(x) + \langle x, w \rangle \end{equation} ($w$ being a small vector in the Hilbert space). I know both the infimum value of $\hat{f}$ and its minimizer (if one exists at all) may be very different from those of $f$, but it seems to me that the difference between the gradients at each iteration, $\|\nabla f (x_n) - \nabla \hat{f} (\hat{x}_n)\|$, should stay of the order of $\mathcal{O} (\|w\|)$.

Any idea whether this is indeed the case, and why? I would be forever grateful!


Indeed, for any $x$ we have $\nabla \hat{f}(x) = \nabla f(x) + w$, and so $\|\nabla f(x) - \nabla \hat{f}(x)\| = \|{-w}\| = \|w\|$.
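The identity above compares the gradients at a common point; the question also asks about the two distinct trajectories $x_n$ and $\hat{x}_n$. A minimal numerical sketch, assuming a strongly convex quadratic $f(x) = \tfrac12 x^\top A x$ (so $\nabla f(x) = Ax$ and $\nabla \hat{f}(x) = Ax + w$) with a hypothetical choice of $A$, step size, and small $w$, suggests the gap $\|\nabla f(x_n) - \nabla \hat{f}(\hat{x}_n)\|$ stays of order $\|w\|$ when both runs start from the same point:

```python
import numpy as np

# Hypothetical setup: f(x) = 0.5 * x @ A @ x with A positive definite,
# so grad f(x) = A @ x and grad f_hat(x) = A @ x + w.
rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # positive definite, lambda_min >= n

w = 1e-3 * rng.standard_normal(n)    # small linear perturbation
eta = 1.0 / np.linalg.norm(A, 2)     # step size 1/L for the quadratic

x = x_hat = rng.standard_normal(n)   # both runs start from the same point
gaps = []
for _ in range(200):
    g = A @ x                        # grad f at the unperturbed iterate
    g_hat = A @ x_hat + w            # grad f_hat at the perturbed iterate
    gaps.append(np.linalg.norm(g - g_hat))
    x = x - eta * g
    x_hat = x_hat - eta * g_hat

# Crude a priori bound: the iterate difference d_n = x_hat_n - x_n obeys
# d_{n+1} = (I - eta*A) d_n - eta*w, hence ||d_n|| <= ||w|| / lambda_min,
# and so the gradient gap ||A d_n + w|| <= (||A|| / lambda_min + 1) * ||w||.
lam_min = np.linalg.eigvalsh(A).min()
bound = (np.linalg.norm(A, 2) / lam_min + 1) * np.linalg.norm(w)
print(max(gaps), np.linalg.norm(w), bound)
```

The first gap equals $\|w\|$ exactly (the starting points coincide), and every later gap stays below the stated multiple of $\|w\|$; here both trajectories converge, so the gap eventually shrinks toward zero.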