Given a convex and smooth function $ f \left( x \right) : \mathbb{R}^{n} \to \mathbb{R} $ whose gradient has the form:
$$ \nabla f \left( x \right) = x - g \left( x \right) $$
where $ g \left( x \right) : \mathbb{R}^{n} \to \mathbb{R}^{n} $ is also smooth (in each component).
One way to find a stationary point of $ f \left( x \right) $ is to find the Fixed Point (or Points) of $ g \left( x \right) $, since at a fixed point the gradient vanishes, hence a fixed point must be a stationary point.
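For instance, a hypothetical 1-D example (my own, chosen only for illustration): taking $ f \left( x \right) = \frac{ {x}^{2} }{2} - \sin \left( x \right) $ gives $ \nabla f \left( x \right) = x - \cos \left( x \right) $, so $ g \left( x \right) = \cos \left( x \right) $ and a fixed point of $ g $ is exactly a stationary point of $ f $:

```python
import math

# Hypothetical example: f(x) = x^2 / 2 - sin(x), hence
# grad f(x) = x - cos(x) and g(x) = cos(x).
x = 1.0
for _ in range(100):
    x = math.cos(x)          # Fixed Point Iteration x_{k+1} = g(x_k)

# At the fixed point the gradient x - cos(x) vanishes,
# so x is (numerically) a stationary point of f.
print(abs(x - math.cos(x)))  # effectively zero (well below 1e-12)
```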
Does the setting above guarantee the convergence of the Fixed Point Iteration $ {x}^{k + 1} = g \left( {x}^{k} \right) $? If not, what constraints should be imposed to guarantee convergence?
I thought it would, since the Newton Method is guaranteed to work on $ f \left( x \right) $, yet I am not sure about it.
I read about the convergence of Fixed Point Iteration in Elementary Numerical Analysis MATH:3800/CS:3700 Notes, Section 3.4, yet those results are for general functions. I wonder if there is an analysis for the case above.
Remark: This is the result of a joint discussion with @Ze-NanLi.
The conditions for the Fixed Point Iteration on $ g \left( x \right) $ to converge are:

1. $ g \left( x \right) $ maps a closed set $ D $ into itself.
2. $ g \left( x \right) $ is a contraction on $ D $, namely $ \left\| \nabla g \left( x \right) \right\| < 1 $.
The motivation for (2) is:
$$ \left\| \nabla g \left( x \right) \right\| = \sup_{ \left\| y \right\| = 1} \left\| \nabla g \left( x \right) y \right\| = \sup_{ \left\| y \right\| = 1} \sqrt{ {y}^{T} {\nabla g \left( x \right)}^{T} \nabla g \left( x \right) y} < \sup_{ \left\| y \right\| = 1} \sqrt{ {y}^{T} y } = 1 $$
This is the condition for a unique Fixed Point in the range.
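A numerical sketch of condition (2) at work, on an assumed quadratic $ f \left( x \right) = \frac{1}{2} {x}^{T} A x - {b}^{T} x $ (so $ \nabla g \left( x \right) = I - A $ is constant; the matrix $ A $ is my own illustrative choice):

```python
import numpy as np

# Hypothetical quadratic: f(x) = 0.5 x^T A x - b^T x gives
# g(x) = x - (A x - b) with constant Jacobian grad g = I - A.
A = np.array([[1.5, 0.2],
              [0.2, 0.8]])
b = np.array([1.0, -1.0])
J = np.eye(2) - A

# Contraction check: the operator 2-norm of grad g is below 1 ...
print(np.linalg.norm(J, 2) < 1.0)   # True

x = np.zeros(2)
for _ in range(200):
    x = x - (A @ x - b)             # x_{k+1} = g(x_k)

# ... so the iteration converges to the unique fixed point A^{-1} b
print(np.allclose(x, np.linalg.solve(A, b)))   # True
```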
Now, since $ \nabla g \left( x \right) = I - {H}_{ f \left( \cdot \right) } \left( x \right) $ is symmetric, the 2 conditions which together satisfy (2) are:

1. $ \nabla g \left( x \right) \prec I $.
2. $ \nabla g \left( x \right) \succ -I $.
Since $ f \left( x \right) $ is convex, it follows that $ {H}_{ f \left( \cdot \right) } \left( x \right) \succeq 0 $, where $ {H}_{ f \left( \cdot \right) } $ is the Hessian Matrix of $ f \left( \cdot \right) $. Hence:
$$ {H}_{ f \left( \cdot \right) } \left( x \right) = I - \nabla g \left( x \right) \succeq 0 \Rightarrow \nabla g \left( x \right) \preceq I $$
So if we want the strict inequality $ \nabla g \left( x \right) \prec I $, one must demand $ f \left( \cdot \right) $ to be strictly convex.
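A quick numerical sanity check of this implication, on an assumed strictly convex quadratic (the matrix below is my own example, so $ {H}_{ f \left( \cdot \right) } $ is constant and strictly positive definite):

```python
import numpy as np

# Assumed strictly convex quadratic: H_f = A with A strictly positive
# definite, so grad g = I - A should have all eigenvalues strictly below 1.
A = np.array([[2.5, 0.3],
              [0.3, 1.2]])
print(np.linalg.eigvalsh(A).min() > 0)       # True: strictly convex

eigs = np.linalg.eigvalsh(np.eye(2) - A)     # spectrum of grad g
print(eigs.max() < 1.0)                      # True: grad g < I

# Note: grad g < I alone is not enough for a contraction; here the smallest
# eigenvalue of grad g is below -1, so the lower bound grad g > -I is also
# needed (this is what the L < 2 condition supplies).
```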
So, in order to guarantee $ g \left( x \right) $ will converge to a Fixed Point, one must demand the following:

1. $ f \left( \cdot \right) $ is strictly convex, which yields $ \nabla g \left( x \right) \prec I $.
2. $ \nabla f \left( \cdot \right) $ is $ L $ Lipschitz Continuous with $ L < 2 $, which yields $ \nabla g \left( x \right) \succ -I $.
Alternative Derivation of the Condition $ \nabla g \left( x \right) \succ -I $
Similarly to the analysis of the Newton Method as a Fixed Point Iteration, one could write:
$$ {x}^{k + 1} = {x}^{k} - \left( {x}^{k} - g \left( {x}^{k} \right) \right) $$
Yet by definition $ \nabla f \left( {x}^{k} \right) = {x}^{k} - g \left( {x}^{k} \right) $, so this is Gradient Descent with a Step Size of 1.
For a function $ f \left( x \right) $ with an $ L $ Lipschitz Continuous Gradient, namely $ \left\| \nabla f \left( x \right) - \nabla f \left( y \right) \right\| \leq L \left\| x - y \right\| $ (equivalently $ {H}_{ f \left( \cdot \right) } \left( x \right) \preceq L I $), Gradient Descent converges for any Step Size $ \alpha < \frac{2}{L} $.
From the above we need $ 1 < \frac{2}{L} \Rightarrow L < 2 $, hence $ I - \nabla g \left( x \right) = {H}_{ f \left( \cdot \right) } \left( x \right) \preceq L I \prec 2 I \Rightarrow \nabla g \left( x \right) \succ -I $ as needed.
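A small numerical sketch of the two regimes, using assumed diagonal quadratics (my own illustrative matrices) so that $ L $ is simply the largest eigenvalue:

```python
import numpy as np

def final_error(A, b, steps=60):
    """Iterate x_{k+1} = x_k - (A x_k - b), i.e. Gradient Descent with
    Step Size 1 on f(x) = 0.5 x^T A x - b^T x; return the final error."""
    x = np.zeros(len(b))
    x_star = np.linalg.solve(A, b)
    for _ in range(steps):
        x = x - (A @ x - b)
    return float(np.linalg.norm(x - x_star))

b = np.ones(2)
A_good = np.diag([1.5, 0.5])   # L = 1.5 < 2: grad g = I - A lies in (-I, I)
A_bad  = np.diag([2.5, 0.5])   # L = 2.5 > 2: grad g has eigenvalue -1.5

print(final_error(A_good, b) < 1e-6)   # True: converges
print(final_error(A_bad, b) > 1e6)     # True: diverges along the L > 2 mode
```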