Proving convexity for multivariable function

5.3k Views Asked by At

I am trying to prove the convexity for the following function: $$f(x) = \sqrt{(a^Tx)^2 + (b^Tx)^2}$$ where $\{a,b,x \} \in \mathbb{R}^n$. I showed that the Hessian is positive semidefinite (which proves the convexity), but the equations are very complicated, so I'm trying to use the definition of convex functions to prove it (i.e., $$f(tx_1 + (1-t)x_2) \leq tf(x_1) + (1-t)f(x_2); \ \forall t \in [0,1]$$

However, I cannot seem to be able to prove this. Any ideas on how to approach this? Your help is much appreciated!

2

There are 2 best solutions below

3
On BEST ANSWER

Let $f : \mathbb R^n \to \mathbb R$ be defined by

$$f (\mathrm x) := \sqrt{ (\mathrm a^{\top} \mathrm x)^2 + (\mathrm b^{\top} \mathrm x)^2 } = \sqrt{ \mathrm x^{\top} \mathrm a \mathrm a^{\top} \mathrm x + \mathrm x^{\top} \mathrm b \mathrm b^{\top} \mathrm x } = \sqrt{ \mathrm x^{\top} \left( \mathrm a \mathrm a^{\top} + \mathrm b \mathrm b^{\top} \right) \mathrm x }$$

Let $\mathrm Q := \begin{bmatrix} | & |\\ \mathrm a & \mathrm b\\ | & |\end{bmatrix}$. Hence, $\mathrm a \mathrm a^{\top} + \mathrm b \mathrm b^{\top} = \mathrm Q \mathrm Q^{\top}$ is positive semidefinite and

$$f (\mathrm x) = \sqrt{ \mathrm x^{\top} \left( \mathrm a \mathrm a^{\top} + \mathrm b \mathrm b^{\top} \right) \mathrm x } = \sqrt{ \mathrm x^{\top} \mathrm Q \mathrm Q^{\top} \mathrm x } = \| \mathrm Q^{\top} \mathrm x \|_2$$

Exploiting the subadditivity of $\|\cdot\|_2$,

$$f \left( \gamma \mathrm x_1 + (1-\gamma) \mathrm x_2 \right) = \| \mathrm Q^{\top} (\gamma \mathrm x_1 + (1-\gamma) \mathrm x_2) \|_2 \leq \gamma \, \| \mathrm Q^{\top} \mathrm x_1 \|_2 + (1-\gamma) \, \| \mathrm Q^{\top} \mathrm x_2 \|_2$$

Thus, $f$ is convex.

4
On

Starting out from $f(tx_1 + (1-t)x_2) \leq tf(x_1) + (1-t)f(x_2)$ lets square both sides to get rid of the square-root on the left to get \begin{align*} (ta^Tx_1 + (1-t)a^Tx_2)^2 + (tb^Tx_1 + (1-t)b^Tx_2)^2 \leq t^2 ((a^Tx_1)^2 + (b^Tx_1)^2) + (1-t)^2 ((a^Tx_2)^2 + (b^Tx_2)^2) + 2t(1-t)\sqrt{((a^Tx_1)^2 + (b^Tx_1)^2)((a^Tx_2)^2 + (b^Tx_2)^2)}. \end{align*} Now writing out the squares on the left and canceling with right hand side terms leaves \begin{align*} 2ta^Tx_1(1-t)a^Tx_2 + 2tb^Tx_1(1-t)b^Tx_2 \leq 2t(1-t)\sqrt{((a^Tx_1)^2 + (b^Tx_1)^2)((a^Tx_2)^2 + (b^Tx_2)^2)}, \end{align*} where we can also drop the $2t(1-t)$ factor. We can square again both sides to get \begin{align*} (a^Tx_1 a^Tx_2 + b^Tx_1 b^Tx_2)^2 \leq ((a^Tx_1)^2 + (b^Tx_1)^2)((a^Tx_2)^2 + (b^Tx_2)^2). \end{align*} Now multiplying out both sides and canceling leaves \begin{align*} 2 a^Tx_1 a^Tx_2 b^Tx_1 b^Tx_2 \leq (a^Tx_1)^2(b^Tx_2)^2 + (b^Tx_1)^2 (a^Tx_2)^2, \end{align*} which is equivalent to \begin{align*} 0 \leq (a^Tx_1)^2(b^Tx_2)^2 + (b^Tx_1)^2 (a^Tx_2)^2 - 2 a^Tx_1 a^Tx_2 b^Tx_1 b^Tx_2 = \left(a^Tx_1 b^Tx_2 - b^Tx_1 a^Tx_2\right)^2, \end{align*} which now obviously is true and should show the convexity of your function.