Find the minimizer of $w_1 \|x - a\|_1 + w_2 \|x - b\|_2$


Find $$\arg\min_{x \in \mathbb{R}^n} w_1\|x - a\|_1 + w_2\|x - b\|_2$$

I'm trying to evaluate

$$\hat{x} := \arg\min_{x \in \mathbb{R}^n} w_1\|x - a\|_1 + w_2\|x - b\|_2 \tag{1}$$

to find a closed form, or at least a simpler expression, in terms of $w_1 > 0, w_2 > 0, a \in \mathbb{R}^n$, and $b \in \mathbb{R}^n$. While this is vague, I'm hoping to have an expression that is reasonably interpretable.
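Before hunting for a closed form, it may help to have a numerical baseline against which candidate expressions can be tested. Below is a minimal pure-Python sketch using plain subgradient descent; all concrete values ($n=2$, $a=(0,0)$, $b=(1,1)$, $w_1=1.2$, $w_2=1$) are made up for illustration.

```python
import math

def f(x, a, b, w1, w2):
    """Objective (1): w1 * ||x - a||_1 + w2 * ||x - b||_2."""
    l1 = sum(abs(xi - ai) for xi, ai in zip(x, a))
    l2 = math.sqrt(sum((xi - bi) ** 2 for xi, bi in zip(x, b)))
    return w1 * l1 + w2 * l2

def subgrad(x, a, b, w1, w2):
    """One valid subgradient of the objective at x."""
    n2 = math.sqrt(sum((xi - bi) ** 2 for xi, bi in zip(x, b)))
    g = []
    for xi, ai, bi in zip(x, a, b):
        s = 1.0 if xi > ai else (-1.0 if xi < ai else 0.0)  # element of the l1 subdifferential
        # At x = b the 2-norm term's subdifferential is the unit ball, which contains 0.
        g.append(w1 * s + (w2 * (xi - bi) / n2 if n2 > 0 else 0.0))
    return g

def minimize(a, b, w1, w2, iters=20000):
    """Plain subgradient descent with diminishing steps; returns the best iterate."""
    x = [bi + 0.5 for bi in b]  # arbitrary starting point
    best_x, best_f = list(x), f(x, a, b, w1, w2)
    for k in range(1, iters + 1):
        g = subgrad(x, a, b, w1, w2)
        step = 0.05 / math.sqrt(k)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        fx = f(x, a, b, w1, w2)
        if fx < best_f:
            best_x, best_f = list(x), fx
    return best_x, best_f

# Made-up example: a = (0,0), b = (1,1), w1 = 1.2, w2 = 1.
x_hat, f_hat = minimize((0.0, 0.0), (1.0, 1.0), 1.2, 1.0)
# For these weights the minimizer turns out to be a itself, with value w2 * ||a - b||_2 = sqrt(2).
```

Note the best-iterate bookkeeping: the subgradient method is not a descent method, so one keeps the best point seen rather than the last one.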

Solution attempt I

In my first attempt at finding a closed form, I took the subgradient, which characterizes the candidate minimizers. From the first-order condition, and assuming $\hat{x} \ne b$ (so that the $2$-norm term is differentiable there), $\hat{x}$ satisfies that zero is a subgradient of the objective in (1); that is, $$0 = w_1 \hat{z} + w_2 \frac{\hat{x}-b}{\|\hat{x}-b\|_2},$$ where, for each component $i$,

$$\hat{z}_i \in \begin{cases} \{\mathrm{sgn}(\hat{x}_i - a_i)\} & \text{if } \hat{x}_i \ne a_i \\ [-1,1] & \text{if } \hat{x}_i = a_i \end{cases}$$

While this is a characterization of $\hat{x}$, it isn't clear to me how to proceed to find a closed form.

Solution attempt II

As a second attempt, I tried to use duality to find a simpler expression.

By introducing $y = x-a$, we can see that this problem is equivalent to $$\hat{y} = \arg\min_{y \in \mathbb{R}^n} w \|y\|_1 + \|y - (b-a)\|_2,$$ where $w = \frac{w_1}{w_2}$ and $\hat{x} = a + \hat{y}$. By Lagrangian duality, we know that there exists some $C \in \mathbb{R}$ so that \begin{align} \hat{y} & = \arg\min_{\|y\|_1 \le C} \|y - (b-a)\|_2 \\ & = \arg\min_{\|y\|_1 \le C} \|y - (b-a)\|_2^2 \\ & = \arg\min_{y} \|y - (b-a)\|_2^2 + \lambda \|y\|_1 \\ & = \mathrm{sgn}(b-a) \left( |b-a| - \lambda/2 \right)_+, \end{align} which is just elementwise soft thresholding, for some dual variable $\lambda$ that depends on $C$ and $b-a$ in some way that I don't understand. This appears to be a closed form, but, since $\lambda$ is not a closed-form function of $w_1, w_2, a$, and $b$, it doesn't satisfy what I'm looking for.
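For reference, the last line above is elementwise soft thresholding. With the normalization $\arg\min_y (y-c)^2 + 2t|y| = \mathrm{sgn}(c)(|c|-t)_+$ (any constant factor is absorbed into the dual variable), it can be sanity-checked by brute force in one dimension; the numbers below are made up.

```python
def soft_threshold(c, t):
    """Scalar prox of the l1 norm: argmin_y (y - c)^2 + 2*t*|y|."""
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0

# Brute-force check on a grid for c = 1.3, t = 0.5 (the answer should be 0.8).
c, t = 1.3, 0.5
grid = [-2.0 + k * 0.001 for k in range(4001)]
brute = min(grid, key=lambda y: (y - c) ** 2 + 2 * t * abs(y))
# brute agrees with soft_threshold(c, t) up to the grid resolution.
```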


There are 2 answers below.


I'll solve it first for the case of $\mathbb{R}^1$. Then I’ll use that intuition to solve it generally.

Write your objective function as $f(x)$. It is real-valued, and it is smooth everywhere except where some $x_i = a_i$ or $x = b$. I'll use the notation that $\hat{x}$ is a minimizer.

If $a=b$, then the trivial solution is $\hat{x}=a=b$. So assume $a \ne b$. I am also going to assume that $w_1 >0$ and $w_2 > 0$; otherwise the solution is again trivial.

In this easy one-dimensional case, your objective function is simply $$ f(x)=w_1|x-a|+ w_2|x-b|. $$ This function is minimized by setting $\hat{x}=a$ if $w_1 \ge w_2$ and $\hat{x}=b$ otherwise. (If $w_1=w_2$, then any $x$ between $a$ and $b$ is a minimizer.)
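The $n=1$ rule is easy to verify by brute force; here is a quick sketch, where the choices $a = 0$, $b = 1$ are arbitrary.

```python
def f1(x, a, b, w1, w2):
    """1-D objective: w1*|x - a| + w2*|x - b|."""
    return w1 * abs(x - a) + w2 * abs(x - b)

a, b = 0.0, 1.0
grid = [-0.5 + k * 0.01 for k in range(201)]

# w1 >= w2: the rule says the minimizer is a.
x1 = min(grid, key=lambda x: f1(x, a, b, 2.0, 1.0))
# w1 < w2: the rule says the minimizer is b.
x2 = min(grid, key=lambda x: f1(x, a, b, 1.0, 2.0))
```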

I will now try to extend this solution to $\mathbb{R}^n$.

The objective function $f(x)$ is continuous, and any closed and bounded $X \subset \mathbb{R}^n$ that contains $a$ and $b$ is compact. So $f(x)$ attains a minimum on $X$. We know that the minimizer $\hat{x}$ will satisfy $$ f(\hat{x}) \le \min\{f(a),f(b)\} = \min\{w_1 \|a-b\|_1,\; w_2 \|a-b\|_2\}. $$ Write $I_i=[\min\{a_i,b_i\}, \max\{a_i,b_i\}]$. Notice that $\hat{x}_i \in I_i$, because otherwise the objective function could be made lower by moving $\hat{x}_i$ into that closed interval. Now put $X=I_1 \times \cdots \times I_n$. From now on, we will restrict $f(\cdot)$ to this compact domain. Let $U \subset \mathbb{R}^n$ be an open set containing $X$. Note further that $f$ is smooth everywhere in $U$, even on the boundary of $X$, except where some $x_i = a_i$ or $x = b$.

The first-order necessary condition with respect to $x_i$, valid when $\hat{x}_i \ne a_i$ and $\hat{x} \ne b$, is $$ \frac{\partial f}{\partial x_i} = \pm w_1 - w_2 \frac{b_i - \hat{x}_i}{\| b-\hat{x} \|_2}= 0, \qquad \text{i.e.,} \qquad \pm w_1 = w_2 \frac{b_i - \hat{x}_i}{\| b- \hat{x} \|_2}. $$ Since $w_1 > 0$ and $w_2 > 0$, no $\hat{x}_i$ equals $b_i$ unless $\hat{x}=b$. If this condition held for every $i$, then summing the squares over $i$ would give $$ n w_1^2 = w_2^2 \sum_i \frac{(b_i - \hat{x}_i)^2}{\| b-\hat{x} \|_2^2} = w_2^2, $$ which cannot be true in general, since $w_1$ and $w_2$ are arbitrary. (And when it does hold, the first-order condition is satisfied along a whole segment of points rather than at a single interior minimizer.)

We have shown that $\hat{x}_i=a_i$ for some $i$, or $\hat{x}=b$, except in the special case $nw_1^2=w_2^2$.

If $\hat{x}_i=a_i$, then moving away from $a_i$ must not decrease the objective function. Thus: $$ w_1^2 \ge w_2^2 \frac{(b_i-a_i)^2}{\| b-\hat{x} \|^2_2}, $$ while for the indices $j$ with $\hat{x}_j \ne a_j$ the first-order condition gives equality, $w_1^2 = w_2^2 (b_j-\hat{x}_j)^2/\| b-\hat{x} \|_2^2$. Summing over all $i$, we derive: $$ nw_1^2 \ge w_2^2. $$ Hence, if $w_1$ is sufficiently large, then there is some $\hat{x}_i=a_i$. This is the appropriate generalization of the condition $w_1 \ge w_2$ from the case $n=1$.

We can say more. Let $$ m_i=\frac{(b_i-a_i)^2}{\| b- a \|_2^2}, $$ and notice that $$ m_i \le \frac{(b_i-a_i)^2}{\| b - \hat{x} \|_2^2}, $$ with strict inequality for any $\hat{x} \in X$ with $\hat{x} \ne a$. Then write $m = \min\{m_1, \ldots, m_n\}$ and $M=\max\{m_1, \ldots, m_n\}$. If $m_i \le m_k$ and $\hat{x}_k =a_k$, then we know that $\hat{x}_i=a_i$ too. Since the comparisons above are between $w_1^2$ and $w_2^2 m_i$, the thresholds involve square roots: if $w_1 \ge \sqrt{M}\, w_2$ then $\hat{x}=a$, and if $w_1 < \sqrt{m}\, w_2$ then $\hat{x}_i \ne a_i$ for any $i$.

In the latter case no coordinate is pinned at $a_i$, and we showed above that the first-order condition then has no interior solution (unless $n w_1^2 = w_2^2$); so in this regime the minimizer must be $\hat{x} = b$. (Note that $\sum_i m_i = 1$, so $\sqrt{m} \le 1/\sqrt{n}$.)
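These thresholds are easy to probe numerically. Below is a brute-force sketch over the box between $a$ and $b$; the values $a=(0,0)$, $b=(1,1)$, $w_2=1$ are made up, and for them every $m_i = 1/2$, so the critical ratio for pinning at $a$ works out to $w_1/w_2 = 1/\sqrt{2} \approx 0.707$.

```python
import math

def f(x, a, b, w1, w2):
    """Objective: w1 * ||x - a||_1 + w2 * ||x - b||_2."""
    l1 = sum(abs(xi - ai) for xi, ai in zip(x, a))
    l2 = math.sqrt(sum((xi - bi) ** 2 for xi, bi in zip(x, b)))
    return w1 * l1 + w2 * l2

a, b, w2 = (0.0, 0.0), (1.0, 1.0), 1.0
box = [(0.02 * i, 0.02 * j) for i in range(51) for j in range(51)]  # grid over X

x_hi = min(box, key=lambda x: f(x, a, b, 0.8, w2))  # w1 = 0.8 > 1/sqrt(2)
x_lo = min(box, key=lambda x: f(x, a, b, 0.5, w2))  # w1 = 0.5 < 1/sqrt(2)
```

With $w_1 = 0.8$, above the critical ratio, the grid minimizer is $a$; with $w_1 = 0.5$, below it, the minimizer falls all the way to $b$.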

Finally, if $\hat{x}=b$, then moving away from $b$ must not decrease the objective function. Here we cannot use the first-order condition, because the $2$-norm term is not differentiable at $b$. Instead, $b$ is a minimizer exactly when $0 \in w_1\, \partial\|\cdot - a\|_1(b) + w_2\, \bar{B}$, where $\bar{B}$, the closed Euclidean unit ball, is the subdifferential of $\|\cdot - b\|_2$ at $b$. Choosing the $\ell_1$ subgradient to vanish on the coordinates with $a_i = b_i$, this reduces to $$ k\, w_1^2 \le w_2^2, $$ where $k$ is the number of indices with $a_i \ne b_i$. So for a sufficiently low penalty $w_1$, we put $\hat{x} = b$.
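This condition for $\hat{x}=b$ can be packaged as a small test. In my reading, the subgradient condition at $b$ is $w_1\sqrt{k} \le w_2$, with $k$ the number of coordinates where $a_i \ne b_i$; the sketch below (made-up data) checks it against brute force in one dimension, where it reduces to $w_1 \le w_2$.

```python
import math

def f(x, a, b, w1, w2):
    """Objective: w1 * ||x - a||_1 + w2 * ||x - b||_2."""
    l1 = sum(abs(xi - ai) for xi, ai in zip(x, a))
    l2 = math.sqrt(sum((xi - bi) ** 2 for xi, bi in zip(x, b)))
    return w1 * l1 + w2 * l2

def b_is_minimizer(a, b, w1, w2):
    """Subgradient test at x = b: w1 * sqrt(k) <= w2, k = #{i : a_i != b_i}."""
    k = sum(1 for ai, bi in zip(a, b) if ai != bi)
    return w1 * math.sqrt(k) <= w2

# 1-D check (k = 1, so the test is just w1 <= w2).
a, b = (0.0,), (1.0,)
grid = [(j * 0.01,) for j in range(101)]
x_small = min(grid, key=lambda x: f(x, a, b, 0.9, 1.0))  # w1 <= w2
x_large = min(grid, key=lambda x: f(x, a, b, 1.1, 1.0))  # w1 > w2
```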


Given vectors $\mathrm a, \mathrm b \in \mathbb R^n$ and weights $w_1, w_2 > 0$

$$\text{minimize} \quad w_1 \| \mathrm x - \mathrm a \|_1 + w_2 \| \mathrm x - \mathrm b \|_2$$

Introducing optimization variables $\mathrm y \in \mathbb R^n$ and $z \in \mathbb R$, I believe the unconstrained optimization problem above can be rewritten as the following constrained optimization problem

$$\begin{array}{ll} \text{minimize} & w_1 \, 1_n^\top \mathrm y + w_2 \, z\\ \text{subject to} & -\mathrm y \leq \mathrm x - \mathrm a \leq \mathrm y\\ & \| \mathrm x - \mathrm b \|_2 \leq z\end{array}$$

Squaring both sides of $\| \mathrm x - \mathrm b \|_2 \leq z$, we obtain

$$\left( \mathrm x - \mathrm b \right)^\top \left( \mathrm x - \mathrm b \right) \leq z^2$$

Dividing both sides by $z$ (assuming $z > 0$; if $z = 0$, the constraint forces $\mathrm x = \mathrm b$), we obtain

$$\left( \mathrm x - \mathrm b \right)^\top \left( z \mathrm I_n \right)^{-1} \left( \mathrm x - \mathrm b \right) \leq z$$

Using the Schur complement, we obtain the following linear matrix inequality (LMI)

$$\begin{bmatrix} z \mathrm I_n & \left( \mathrm x - \mathrm b \right)\\ \left( \mathrm x - \mathrm b \right)^\top & z\end{bmatrix} \succeq \mathrm O_{n+1}$$

Thus, we have a semidefinite program (SDP) in $\mathrm x, \mathrm y \in \mathbb R^n$ and $z \in \mathbb R$

$$\boxed{\begin{array}{ll} \text{minimize} & w_1 \, 1_n^\top \mathrm y + w_2 \, z\\ \text{subject to} & -\mathrm y \leq \mathrm x - \mathrm a \leq \mathrm y\\ & \begin{bmatrix} z \mathrm I_n & \left( \mathrm x - \mathrm b \right)\\ \left( \mathrm x - \mathrm b \right)^\top & z\end{bmatrix} \succeq \mathrm O_{n+1}\end{array}}$$