I am trying to understand a proof involving uncertainty ellipsoids in robust approximation. On page 322 of Boyd's *Convex Optimization* he has the following:
$$\mathcal{A} = \{ [a_1 \;\cdots\; a_m]^T \mid a_i \in \mathcal{E}_i,\; i = 1, \dots, m \}$$ where $$\mathcal{E}_i = \{ \bar{a}_i + P_i u \mid \|u\|_2 \le 1 \}.$$
He then gives the worst-case magnitude of each residual as
$$\sup_{a_i \in \mathcal{E}_i} |a_i^T x - b_i| = \sup\{\, |\bar{a}_i^T x - b_i + (P_i u)^T x| \;\big|\; \|u\|_2 \le 1 \,\}$$
$$= |\bar{a}_i^T x - b_i| + \|P_i^T x\|_2.$$
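As a quick numerical sanity check (not an explanation), I verified the identity with random NumPy data. My guess for the attaining point is $u^\star = \pm P_i^T x / \|P_i^T x\|_2$, with the sign matching the residual $\bar{a}_i^T x - b_i$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
P = rng.standard_normal((n, n))
a_bar = rng.standard_normal(n)
x = rng.standard_normal(n)
b = rng.standard_normal()

res = a_bar @ x - b
closed_form = abs(res) + np.linalg.norm(P.T @ x)

# guessed maximizer: u aligned with P^T x, sign matching the residual
u_star = np.sign(res) * (P.T @ x) / np.linalg.norm(P.T @ x)
attained = abs(res + (P @ u_star) @ x)
print(np.isclose(attained, closed_form))  # True

# random feasible u (unit sphere) never exceed the closed-form value
us = rng.standard_normal((10000, n))
us /= np.linalg.norm(us, axis=1, keepdims=True)
sampled = np.abs(res + us @ (P.T @ x))
print(sampled.max() <= closed_form + 1e-12)  # True
```

So numerically the second expression really is the supremum of the first, attained at that $u^\star$, but I still don't see the argument.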
I don't understand how to go from step 1 to step 2, specifically where the last term $\|P_i^T x\|_2$ comes from. I'm assuming $$\sup\{|\bar{a}_i^T x - b_i + (P_i u)^T x|\} = |\bar{a}_i^T x - b_i| + |(P_i u)^T x| = |\bar{a}_i^T x - b_i| + |u^T P_i^T x|,$$ since for any norm $$\|x + y\| \le \|x\| + \|y\|.$$
I suppose that since
$$\sup_{u, x \neq 0} \frac{u^T P_i x}{\|u\|_2 \|x\|_2} = \sup_{x \neq 0} \frac{\|P_i x\|_2}{\|x\|_2},$$
the proof might follow. However, though it's given that $\|u\|_2 \le 1$, can we assume that at the supremum $\|u\|_2 = \|x\|_2 = 1$? Also, can we generally assume that $u^T P_i x = u^T P_i^T x$? I suspect I am missing something very elementary or taking the wrong approach, so any thoughts or insights would be greatly appreciated!
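Edit: as a sanity check on the norm identity above (again not a proof), I verified numerically with NumPy that for a generic matrix $A$ the top singular vector pair attains $\sup u^T A x$ over unit vectors, and that random unit vectors never exceed $\|A\|_2$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
sigma_max = np.linalg.norm(A, 2)  # spectral norm = largest singular value

# the top singular vector pair (u1, v1) attains u^T A x = sigma_max
U, s, Vt = np.linalg.svd(A)
attained = U[:, 0] @ A @ Vt[0]
print(np.isclose(attained, sigma_max))  # True

# random unit u, x never exceed it (Cauchy-Schwarz: u^T A x <= ||A x||_2)
for _ in range(1000):
    u = rng.standard_normal(5); u /= np.linalg.norm(u)
    x = rng.standard_normal(5); x /= np.linalg.norm(x)
    assert u @ A @ x <= sigma_max + 1e-12
print("no sample exceeded sigma_max")
```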