Finding bounds on the norm of the difference between two vectors.

1.8k Views Asked by At

Consider the Normed Vector Space $(V, \left\| \cdot \right\|_{p})$ defined on the Field $\mathbb{R}$ where $V = \mathbb{R}^{n}$ and $\left\| \cdot \right\|$ is the Minkowski p-Norm. Vector addition $\oplus_{V}$ and scalar multiplication $\otimes_{V}$ follows standard definitions, i.e.

$\forall \underline{u},\underline{v} \in V$, $k \in \mathbb{R}$

$\left(\underline{u} \oplus_{V} \underline{v}\right)_{i} = u_{i} + v_{i}$

$\left(k \otimes_{V} \underline{v}\right)_{i} = k \cdot v_{i}$

Define the difference between vectors using

$D\left(\underline{u},\underline{v}\right) = \left\| \underline{u} -\underline{v}\right\|_{p}$

I am trying to find the maximum value this difference can take when $\underline{u} , \underline{v}$ belong in the first Orthant and are of unit length, i.e.

$\operatorname{arg\,max}_{\underline{u}, \underline{v} \in T} D\left(\underline{u},\underline{v}\right) $

Where

$T = \left\{ \underline{v} \in V \: | \: v_{i} \geq 0 \: ,\: \left\|\underline{v}\right\|_{p} = 1 \right\} $

I started initially by applying the Triangle inequality, i.e.

$ D\left(\underline{u},\underline{v}\right) = \left\| \underline{u} -\underline{v}\right\|_{p} \leq \left\| \underline{u} \right\|_{p} + \left\| -\underline{v}\right\|_{p} = \left\| \underline{u} \right\|_{p} + \left\| \underline{v}\right\|_{p}$

Which for vectors in $T$ becomes

$ D\left(\underline{u},\underline{v}\right) \leq 1 + 1 = 2$

The triangle inequality is 'equal' when $\underline{u}$ and $\underline{v}$ are Linearly Dependent, i.e

$\underline{u} = \lambda \left(-\underline{v}\right)$, $\lambda \in \mathbb{R}$

This represents a situation that can not occur for vectors in $T$ due to the elements being positive.

I was hoping I would be able to find a 'tighter' upper bound. I decided to examine a simplified case when $p = 2$

$D\left(\underline{u},\underline{v}\right) = \left\| \underline{u} -\underline{v}\right\|_{2} = \left[\sum_{i=1}^{n}\left|u_{i} - v_{i}\right|^{2}\right]^{\frac{1}{2}} = \left[\sum_{i=1}^{n}\left(u_{i} - v_{i}\right)^{2}\right]^{\frac{1}{2}} = \left[\sum_{i=1}^{n}\left(u_{i}^{2} + v_{i}^{2} - 2u_{i}v_{i}\right)\right]^{\frac{1}{2}}$

$= \left[\sum_{i=1}^{n}u_{i}^{2} + \sum_{i=1}^{n}v_{i}^{2} - 2\sum_{i=1}^{n}u_{i}v_{i} \right]^{\frac{1}{2}} = \left[\left\|\underline{u}\right\|_{2}^{2} + \left\|\underline{v}\right\|_{2}^{2} - 2\sum_{i=1}^{n}u_{i}v_{i}\right]^{\frac{1}{2}}$

Remember that we are dealing with vectors in $T$ which are of unit length, thus,

$D\left(\underline{u},\underline{v}\right)= \left[2 - 2\sum_{i=1}^{n}u_{i}v_{i}\right]^{\frac{1}{2}} = 2^{\frac{1}{2}}\left[1 - \sum_{i=1}^{n}u_{i}v_{i}\right]^{\frac{1}{2}}$

Now as $u_{i}, v_{i}$ are positive here, we observe that the maximum $D\left(\underline{u},\underline{v}\right)$ between $\underline{u},\underline{v}$ occurs when

$\sum_{i=1}^{n}u_{i}v_{i} = 0$

Or, when they share no common positive indices. For $p = 2$ and vectors in $T$ this refers to when the two vectors are Orthogonal based on it's inner product definition.

As such, the bound from before can be updated, so that when $p = 2$

$0 \leq D\left(\underline{u}, \underline{v}\right) \leq \sqrt{2}$

I'm interested to see if this mode of thought applies to all $p$, i.e. that the maximum occurs when they share no common non-zero indexes.

Note - I'm using 'non common non-zero indexes' as I'm not sure 'Orthogonality' can be used as a descriptor in a Vector Space with no Inner Product Defined

Now, when working with the general p-Norm we have

$D\left(\underline{u},\underline{v}\right) = \left\| \underline{u} -\underline{v}\right\|_{p} = \left[\sum_{i=1}^{n}\left|u_{i} - v_{i}\right|^{p}\right]^{\frac{1}{p}} = \left[\sum_{i \in A}\left|u_{i} - v_{i}\right|^{p} + \sum_{i \in B}\left|u_{i} - v_{i}\right|^{p} + \sum_{i \in C}\left|u_{i} - v_{i}\right|^{p} + \sum_{i \in D}\left|u_{i} - v_{i}\right|^{p} \right]^{\frac{1}{p}}$

Where A,B, C and D are given by

$A = \left\{ i \in \mathbb{N} \: | \: u_{i} = 0, v_{i} = 0 \right\}$

$B = \left\{ i \in \mathbb{N} \: | \: u_{i} = 0, v_{i} \neq 0 \right\}$

$C = \left\{ i \in \mathbb{N} \: | \: u_{i} \neq 0, v_{i} = 0 \right\}$

$D = \left\{ i \in \mathbb{N} \: | \: u_{i} \neq 0, v_{i} \neq 0 \right\}$

Hence,

$D\left(\underline{u},\underline{v}\right) = \left\| \underline{u} -\underline{v}\right\|_{p} = \left[\sum_{i=1}^{n}\left|u_{i} - v_{i}\right|^{p}\right]^{\frac{1}{p}} = \left[\sum_{i \in A}\left|u_{i} - v_{i}\right|^{p} + \sum_{i \in B}\left|u_{i} - v_{i}\right|^{p} + \sum_{i \in C}\left|u_{i} - v_{i}\right|^{p} + \sum_{i \in D}\left|u_{i} - v_{i}\right|^{p} \right]^{\frac{1}{p}} = \left[\sum_{i \in A}\left|0 - 0\right|^{p} + \sum_{i \in B}\left|0 - v_{i}\right|^{p} + \sum_{i \in C}\left|u_{i} - 0\right|^{p} + \sum_{i \in D}\left|u_{i} - v_{i}\right|^{p} \right]^{\frac{1}{p}} = \left[\sum_{i \in B}\left|v_{i}\right|^{p} + \sum_{i \in C}\left|u_{i}\right|^{p} + \sum_{i \in D}\left|u_{i} - v_{i}\right|^{p} \right]^{\frac{1}{p}}$

And it is at this point that I'm stuck. I can say that if $D = \emptyset$, i.e. they share no common non-zero indexes, then

$D\left(\underline{u},\underline{v}\right) = \left[\sum_{i \in B}\left|v_{i}\right|^{p} + \sum_{i \in C}\left|u_{i}\right|^{p} \right]^{\frac{1}{p}} = \left[ \left\|\underline{u} \right\|_{p}^{p} + \left\|\underline{v} \right\|_{p}^{p} \right]^{\frac{1}{p}} = 2^{\frac{1}{p}}$

But I have no idea if this is the maximum value at this point. It does however hold for $p = 1,2$.

Does anyone have any suggestions on how to proceed?

Thanks in Advance, David

2

There are 2 best solutions below

1
On

To get some hold on the problem look at the first two coordinates, and assume $$u_1^p+u_2^p=\lambda^p,\quad v_1^p+v_2^p=\mu^p\tag{1}$$ with $\lambda>0$, $\mu>0$. We can let $u':=(u_1,u_2)$ rotate on the $p$-"circle" of radius $\lambda$ and let $v':=(v_1,v_2)$ rotate on the $p$-"circle" of radius $\mu$. This means that we look at $$u'(s):=\lambda\bigl((1-s)^{1/p},s^{1/p}\bigr), \qquad v'(t):=\mu\bigl(t^{1/p},(1-t)^{1/p}\bigr)$$ for $0\leq s\leq1$, $\ 0\leq t\leq1$. In this way $(1)$ is fulfilled for all $s$ and $t$. We now have to look at the function $$D(s,t):=\left|\lambda(1-s)^{1/p}-\mu \>t^{1/p}\right|^p+\left|\lambda\>s^{1/p}-\mu (1-t)^{1/p}\right|^p$$ of two variables. Plotting it using Mathematica for various values of $\lambda$, $\mu$, and $p\geq1$ shows that it takes its maximum when $s=t=0$ or $s=t=1$, as conjectured. This would indicate that your index set $D$ is empty.

2
On

Your argument more or less goes through in general. The cases $p=1$ and $p=+\infty$ are special and may easily be done by hand. We assume here that $1<p<+\infty$. On the compact set $T\times T$ the maximum is attained for some couple $(u,v)$. We fix this extremal couple in the following. Observe that with your notation for the index sets: $$ 1=\|u\|_p^p = \sum_{i\in C}|u_i|^p + \sum_{i\in D} |u_i|^p $$ (and similarly for $v$) so that we may rearrange to get: $$ D(u,v)^p = \sum_{i\in C} u_i^p + \sum_{i\in B} v_i^p + \sum_{i\in D} |u_i-v_i|^p = 2 + \sum_{i\in D} \left(|u_i-v_i|^p - u_i^p -v_i^p\right) $$ For $a\geq b\geq 0$ clearly $|a-b|^p-a^p-b^p \leq -|b|^p$ and more generally for any $a,b\geq 0$ one has: $|a-b|^p-a^p-b^p \leq - \min\{a^p,b^p\}$. Therefore, $$ D(u,v)^p \leq 2 - \sum_{i\in D} \min\{u_i,v_i\}^p$$ which is strictly less than 2 unless $D$ was empty in the first place. QED

This method of proof is not restricted to finite dimensions and works in a general measure space.

EDIT: A solution using Lagrange multipliers: Suppose that the index set $D$ is non-empty. Now, $u_i$ and $v_i$ are both non-zero for $i\in D$ so we may use Lagrange multipliers to study the extremal value for this functional, the constraints being that $g(u)=\sum_{i\in C∪D} u_i^p −1$ and $h(v)=\sum_{i\in B∪D} v_i^p−1$ both should vanish. Writing $F(u,v,a,b)=D(u,v)p−ag(u)−bh(v)$ we see that at our extremal point, all partial derivatives of $F$ should vanish (we use here $p>1$ so that $D(u,v)$ is differentiable also when $u_i=v_i$ and only consider non-vanishing components of $u_i$ and $v_i$). In particular, taking the derivative w.r.t. $u_i$ with $i\in D$: $$p|u_i−v_i|^{p−1}{\rm sign}(u_i−v_i)−(1+a)p|u_i|^{p−1}=0 $$ This implies that ${\rm sign}(u_i−v_i)$ must be constant for $i\in D$ (we may assume $≥0$) and then that $v_i=t u_i$, $i\in D$ for some constant $0<t≤1$. Thus, $$D(u,v)^p=2+((1−t)^p−1−t^p)\sum_{i \in D}|u_i|^p$$ Finally, the t-dependent factor is strictly negative unless t=0, implying that the index set $D$ must have been empty in the first place. So your conclusion goes through and $\max_{T×T}D(u,v)=2^{1/p}$. A formula which is also valid for $p=1$ and $p=\infty$.