I was watching the trust region algorithm implementation in the dlib code, optimization_trust_region.h, where it is stated that
"This is an implementation of algorithm 4.3 (Trust Region Subproblem) from the book Numerical Optimization by Nocedal and Wright. Some of the details are also from Practical Methods of Optimization by Fletcher."
I started to compare both algorithms and found mismatch which I can't explain. The heart of the algorithm is the 1-d root-finding procedure for some scalar function $\varphi$. The formulae for the iteration step mismatch and that's how.
We have: $$ \varphi(\lambda) = \frac{1}{\Delta} - \frac{1}{\|p\|} = \frac{1}{\Delta} - \big(p^T p\big)^{-1/2}. $$ Hence, $$ \varphi'(\lambda) = \frac12 \big(p^T p\big)^{-3/2} \cdot 2 p^T p' = \frac{p^T p'}{\|p\|^3}. $$ What about $p' \equiv \dfrac{dp}{d\lambda}$? We have $$ (B + \lambda I)p = -g, $$ and this means that, differentiating by $\lambda$, we get $$ p + (B + \lambda I)p' = 0. $$ Remember that in algorithm 4.1 we factorize the matrix $(B + \lambda I)$ via Cholesky decomposition: $(B + \lambda I) = R^T R$, and this means that $$ R^T R p' = -p, $$ and if we denote $Rp' = q$ (which means that $q$ is the solution of the equation $R^T q = -p$), then we have $$ p^T p' = (p, p') = -(R^T R p', p') = -(Rp', Rp') = -\|Rp'\|^2 = -\|q\|^2. $$ Combining all this together we come to the conclusion that $$ -\frac{\varphi(\lambda)}{\varphi'(\lambda)} = -\frac{\|p\| - \Delta}{\Delta \|p\|}\cdot \frac{\|p\|^3}{-\|q\|^2} = \bigg(\frac{\|p\|}{\|q\|}\bigg)^2\bigg(\frac{\|p\| - \Delta}{\Delta}\bigg) $$ in perfect accordance with the page~87 of the book.
If we turn to the dlib code however, we see that: a) their $R$ is our $R^T$ (because their Cholesky factorization looks $RR^T$ instead of $R^T R$); and b) they define the vector $q$ differently, as the solution of the equation $Rq = -g$ which means that their $q = R^T p$.
So, if we use Nocedal-Wright notation, we can compare the two approaches: $$ \begin{array}{l|c|c} & \text{Nocedal-Wright} & \text{dlib library} \\ \hline \text{Eq. for } p: & R^T Rp = -g & \text{the same} \\ \hline \text{Eq. for } q: & R^T q = -p & R^T q = -g \\ \hline \text{Definition of } q: & q = Rp' & q = Rp \\ \hline \text{Iteration step:} & \bigg(\dfrac{\|p_{\ell}\|}{\|q_{\ell}\|}\bigg)^2\bigg(\dfrac{\|p_{\ell}\| - \Delta}{\Delta}\bigg) & \bigg(\dfrac{\|q_{\ell}\|}{\|p_{\ell}\|}\bigg)^2\bigg(\dfrac{\|p_{\ell}\| - \Delta}{\Delta}\bigg) \\ \hline \end{array} $$
How could one make sense of the discrepancy? Could both approaches be correct, and how?