I am self-studying convex optimization by Boyd and Vandenberghe and I am currently facing some problems in solving Exercise Chapter 3.44(a). Thus, I would like to discuss about this question. The question is given as follows:
3.44) Second-order conditions for quasiconvexity. In this problem we derive alternate representations of the second-order conditions for quasiconvexity given in Section 3.4.3. Prove the following.
a) A point $x\in \mathbf{dom} f$ satisfies $$y^T\nabla f(x)=0 \Rightarrow y^T\nabla^2 f(x)y\geq0$$ if and only if there exists a $\sigma$ such that $$\nabla^2f(x)+\sigma\nabla f(x) \nabla f(x)^T \geq 0$$
Hint. We can assume without loss of generality that $\nabla^2 f(x)$ is diagonal.
To lighten the notation, let $a=\nabla f(x)$ and $B=\nabla^2 f(x)$.
My attempt is: If $y^Ta=0 \Rightarrow y^TBy\geq0$, then $y^T(B+\sigma aa^T)y=y^TBy+\sigma y^Taa^Ty=y^TBy\geq0$ for any $\sigma \in \mathbb{R}$.
Suppose there exists a $\sigma$ such that $B+\sigma aa^T \geq 0$, then we have $y^T(B+\sigma aa^T)y=y^TBy+\sigma \Vert a^Ty \Vert^2\geq0$. If $a^Ty=0$, then $y^TBy+\sigma \Vert a^Ty \Vert^2=y^TBy\geq0$ for any $\sigma \in \mathbb{R}$.
If $a^Ty\neq0$, we can conclude that $\Vert a^Ty \Vert^2>0$. Under this assumption, we have to show there exists no such $\sigma$ where $y^TBy<0$ and $B+\sigma aa^T \geq 0$. Based on the hint, we assume that $B$ is diagonal. Furthermore, the earlier condition suggests that $b_1,...,b_n \leq 0$, and $y^TBy=\sum b_iy_i^2< 0$. On the other hand, we have $$\sum b_iy_i^2+\sigma \Vert a^Ty \Vert^2\geq0$$ $$\sigma \geq \frac{-\sum b_iy_i^2}{\Vert a^Ty \Vert^2}>0$$
This implies that $\sigma> 0$.
As seen above, if the job is to argue the case of "if and only if there exists any $\sigma\in \mathbb{R}$", then I believe I'm done because the last case shows that $\sigma\in \mathbb{R_{++}}$. Nevertheless, I think the authors are trying to say exactly what they wrote. In that case, my proof is not complete and my entire approach might be wrong. Thus, I would like to know how can I prove the above?
Some thoughts
- I have a feeling that the statement should be "... if and only if there exists any $\sigma \in \mathbb{R}$ such that ...". If this is true, then I can see why $B$ has at most one negative eigenvalue, which is related to Ex. 3.44(b).
- I have taken a look at the solution but I think there are some mistakes as they derived the Schur Complement of $C_{nn}$ of $C$, where $C=\mathbf{diag}(b)+\sigma aa^T$. Specifically, they showed the Schur Complement of $C_{nn}$ is $$C_{nn}= \mathbf{diag}(\bar b)+\sigma\bar a \bar a^T-\frac{a_n^2}{b_n+\sigma a_n^2}\bar a \bar a^T= \mathbf{diag}(\bar b)+\frac{a_n^2\sigma^2+b_n\sigma-a_n^2}{b_n+\sigma a_n^2}\bar a \bar a^T,$$ where $\bar a=(a_1,...,a_{n-1}),\bar b=(b_1,...,b_{n-1})$. Nevertheless, I found that the Schur Complement of $C_{nn}$ of $C$ should be $$C_{nn}= \mathbf{diag}(\bar b)+\sigma\bar a \bar a^T-\frac{\sigma^2a_n^2}{b_n+\sigma a_n^2}\bar a \bar a^T= \mathbf{diag}(\bar b)+\frac{\sigma b_n}{b_n+\sigma a_n^2}\bar a \bar a^T.$$ If there were no mistakes in my derivation, then this equation leads to the fact that $\sigma = 0$ and I can find a contradiction in the argument.
In short, this question really puzzled me.