$L^1$ and $L^4$ norm relation to fixed $L^2$ norm

Hello math stack exchange!

Suppose I have some set of nonnegative values $x_1, x_2, \cdots, x_n$ and I have that the sum of their squares is a constant so $x_1^2+x_2^2+\cdots+x_n^2=\sum_{i=1}^n x_i^2 = K$ for some constant $K$.

I want to show that under this constraint, minimizing the fourth power of the $L^4$ norm, or the sum of the fourth powers of the values: $x_1^4+x_2^4+\cdots+x_n^4 = \sum_{i=1}^n x_i^4$ is equivalent to maximizing the $L^1$ norm or $\sum_{i=1}^n x_i$.

I don't need a solution tied to any specific optimization procedure; it suffices to show that for two sets of values with the same sum of squares, if one has the greater sum of fourth powers, the other has the greater $L^1$ norm.

I anticipate some solution involving Cauchy-Schwarz, but any assistance would be much appreciated. Thank you!

Edit: I ran some Monte Carlo simulations and am now unsure whether this claim is true. My simulation is as follows: I draw two samples at random from a very high-dimensional standard Gaussian, then take absolute values to ensure nonnegative entries. I compare the two sums of squares and append one extra entry to the vector with the smaller sum so that the sums of squares match, and then compare the fourth powers of the $L^4$ norms against the $L^1$ norms. My win rate, i.e. the fraction of trials in which my hypothesis holds, is around $89\%$ over many Monte Carlo trials. I am unsure, however, whether this is an artifact of my specific construction or whether some more general statement can be made. Any clarification would be extremely valuable.
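For reference, here is a minimal NumPy sketch of the procedure above (the dimension, trial count, and the exact padding step are my own choices, not canonical):

```python
import numpy as np

rng = np.random.default_rng(0)

def trial(dim=10_000):
    # Two nonnegative samples: abs of high-dimensional standard Gaussians.
    x = np.abs(rng.standard_normal(dim))
    y = np.abs(rng.standard_normal(dim))
    # Append one entry to the vector with the smaller sum of squares
    # so that both sums of squares match.
    sx, sy = np.sum(x**2), np.sum(y**2)
    if sx < sy:
        x = np.append(x, np.sqrt(sy - sx))
    else:
        y = np.append(y, np.sqrt(sx - sy))
    # Hypothesis: the vector with the smaller sum of fourth powers
    # has the larger L1 norm.
    smaller_l4 = np.sum(x**4) < np.sum(y**4)
    larger_l1 = np.sum(x) > np.sum(y)
    return smaller_l4 == larger_l1

winrate = np.mean([trial() for _ in range(200)])
print(winrate)
```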


I will interpret your question as the following: Let $K>0$ and let $X$ be the set of ordered $n$-tuples $(x_1, x_2, \dots, x_n)$ such that $\sum_{i=1}^n{x_i^2}=K$. Let $A$ be the set of all $n$-tuples $x \in X$ such that for any $y \in X$, $\sum_{i=1}^n{x_i^4}\geq \sum_{i=1}^n{y_i^4}$, where $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$. Similarly, let $B$ be the set of all $x \in X$ such that for any $y \in X$, $\sum_{i=1}^n{|x_i|} \leq \sum_{i=1}^n{|y_i|}$. Does $A = B$?

It does! We can prove this and a more general result using the method of Lagrange multipliers.

Suppose we have functions $f: \mathbb{R}^n \rightarrow \mathbb{R}$ and $g: \mathbb{R}^n \rightarrow \mathbb{R}$ and we want to find the maximum of $f(\mathbf{x})$ subject to the constraint $g(\mathbf{x})=0$, for $\mathbf{x} \in \mathbb{R}^n$. We can introduce a new variable $\lambda$ called the Lagrange multiplier and define the Lagrange function as follows: $$\mathcal{L}(\mathbf{x}, \lambda) = f(\mathbf{x}) + \lambda g(\mathbf{x}).$$ The important feature of the Lagrange function is that if $\mathbf{x} \in \mathbb{R}^n$ is a maximum or minimum of $f(\mathbf{x})$ with respect to the constraint $g(\mathbf{x})=0$, then there exists a $\lambda \in \mathbb{R}$ such that $$\nabla_{\mathbf{x}, \lambda}\mathcal{L} = 0 \text{, where }$$ $$\nabla_{\mathbf{x}, \lambda}\mathcal{L} = \left( \frac{\partial \mathcal{L}}{\partial x_1}, \dots, \frac{\partial \mathcal{L}}{\partial x_n}, \frac{\partial \mathcal{L}}{\partial \lambda} \right) .$$

In the first part of your problem we can consider the situation where we have $f(\mathbf{x}) = \sum_{i = 1}^n x_i^4$ and $g(\mathbf{x}) = \sum_{i=1}^n x_i^2 -K$. Therefore we have $\frac{\partial \mathcal{L}}{\partial \lambda} = g(\mathbf{x}) = 0$ and $\frac{\partial \mathcal{L}}{\partial x_i} = 4x_i^3+2\lambda x_i$. From this we can make three observations:

  • $\frac{\partial \mathcal{L}}{\partial x_i} \rvert_{x_i=0} = 0$ for all $\lambda \in \mathbb{R}$.
  • Suppose we have a solution $(\mathbf{x},\lambda)$ to $\nabla_{\mathbf{x}, \lambda}\mathcal{L} = 0$. Then for any two nonzero components $x_i$ and $x_j$ of $\mathbf{x}$, we have $x_i^2=x_j^2$: setting $\frac{\partial \mathcal{L}}{\partial x_i} = 0$ with $x_i \neq 0$ forces $x_i^2$ to a single common value determined by $\lambda$.
  • Both $f(\mathbf{x})$ and $\nabla_{\mathbf{x}, \lambda}\mathcal{L}$ are invariant with respect to permutations on the components of $\mathbf{x}$, i.e., if $(x_1, \dots, x_n)$ is a maximum of $f(\mathbf{x})$ and $\sigma$ is some permutation on the set $\{1,2,\dots,n\}$, then $(x_{\sigma(1)}, x_{\sigma(2)}, \dots, x_{\sigma(n)})$ is also a maximum of $f(\mathbf{x})$.
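As a quick symbolic check of these observations (a sketch only; the sign on the $\lambda$ term depends on whether one writes $f + \lambda g$ or $f - \lambda g$, which does not change the stationary points):

```python
import sympy as sp

x, lam, K = sp.symbols('x lambda K')

# One coordinate of L = sum(x_i^4) + lambda*(sum(x_i^2) - K);
# the other coordinates' terms vanish when differentiating in x_i.
L_coord = x**4 + lam*(x**2 - K)
dL = sp.diff(L_coord, x)

# Stationary points in this coordinate: x = 0, or x^2 equal to a common
# value set by lambda -- so any two nonzero components share the same square.
sols = sp.solve(sp.Eq(dL, 0), x)
print(dL, sols)
```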

Now suppose we have a solution $(\mathbf{x}, \lambda)$ to $\nabla_{\mathbf{x}, \lambda}\mathcal{L} = 0$ such that $\mathbf{x}$ has $n$ nonzero components. Since, by the second observation, the squares of all nonzero entries are equal, $$g(\mathbf{x})=0 \implies \sum_{i=1}^n x_i^2 = K \implies nx_1^2=K \implies x_i = \pm \sqrt{\frac{K}{n}},$$ and so $f(\mathbf{x})=\frac{K^2}{n}$. By a similar calculation we can find that for any solution $(\mathbf{x}, \lambda)$ to $\nabla_{\mathbf{x}, \lambda}\mathcal{L} = 0$ such that $\mathbf{x}$ has $m$ nonzero components, $f(\mathbf{x}) = \frac{K^2}{m}$, for $1\leq m\leq n$. Thus we find that there are $2n$ distinct maxima of $f(\mathbf{x})$ of the form $(\pm \sqrt{K}, 0,0, \dots), (0,\pm \sqrt{K}, 0, \dots)$, etc., and $2^n$ distinct minima of the form $(\pm \sqrt{\frac{K}{n}}, \pm \sqrt{\frac{K}{n}}, \dots).$
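The $f(\mathbf{x}) = \frac{K^2}{m}$ pattern is easy to verify numerically (a sketch with arbitrary $K = 4$, $n = 6$):

```python
import numpy as np

K, n = 4.0, 6

# Stationary point with m nonzero components: m copies of sqrt(K/m).
f_vals = []
for m in range(1, n + 1):
    xm = np.concatenate([np.full(m, np.sqrt(K / m)), np.zeros(n - m)])
    assert np.isclose(np.sum(xm**2), K)   # constraint g(x) = 0 holds
    f_vals.append(np.sum(xm**4))          # equals K^2 / m
```

The list `f_vals` is largest at $m=1$ (value $K^2$) and smallest at $m=n$ (value $K^2/n$).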

If we were to go through this whole process again for the $L^1$ case, i.e., with $f(\mathbf{x}) = \sum_{i=1}^n |x_i|$ and $g(\mathbf{x}) = \sum_{i=1}^n x_i^2 -K$, we would find that for any $\mathbf{x} \in \mathbb{R}^n$ with $m$ nonzero components which solves $\nabla_{\mathbf{x}, \lambda}\mathcal{L} = 0$, $f(\mathbf{x}) = \sqrt{mK}$, for $1\leq m \leq n$. Similarly to before, this gives us $2n$ distinct minima of $f(\mathbf{x})$ of the form $(\pm \sqrt{K}, 0,0, \dots), (0,\pm \sqrt{K}, 0, \dots)$, etc., and $2^n$ distinct maxima of the form $(\pm \sqrt{\frac{K}{n}}, \pm \sqrt{\frac{K}{n}}, \dots).$
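The same numerical check for the $L^1$ values $\sqrt{mK}$ (same arbitrary $K$ and $n$ as before):

```python
import numpy as np

K, n = 4.0, 6

# Same stationary points; now evaluate the L1 norm instead.
l1_vals = []
for m in range(1, n + 1):
    xm = np.concatenate([np.full(m, np.sqrt(K / m)), np.zeros(n - m)])
    l1_vals.append(np.sum(np.abs(xm)))    # equals sqrt(m * K)
```

Here `l1_vals` is smallest at $m=1$ and largest at $m=n$: exactly the opposite pattern to the $L^4$ case.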

Further, if we let $f(\mathbf{x})=\sum_{i=1}^n x_i^p$ for any even $p>2$, we would find an identical relation between the maxima and minima of the $L^1$ and $L^p$ norms over a sphere.
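For a general even exponent $p$, the stationary value with $m$ nonzero components is $m \cdot (K/m)^{p/2} = K^{p/2}\, m^{1-p/2}$, again maximal at $m=1$ and minimal at $m=n$. A quick check (arbitrary $K$, $n$):

```python
K, n = 4.0, 6

for p in (4, 6, 8):  # even exponents p > 2
    # m nonzero components, each sqrt(K/m): sum of p-th powers.
    vals = [m * (K / m) ** (p / 2) for m in range(1, n + 1)]
    assert vals[0] == max(vals)    # maximal with one nonzero component
    assert vals[-1] == min(vals)   # minimal with all components nonzero
```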

I hope this conforms to the spirit of your question! (BTW, if you are just concerned with $n$-tuples with nonnegative components, your solution is the subset of the Lagrange-method solutions with nonnegative components. In this case the relation still holds.)