Minimizing Euclidean distance between target and sum of selected vectors

123 Views Asked by At

I have multiple category sets of food $A,B,...$, each with nutritional information encoded in a vector $x_{a_1},x_{a_2},..., x_{b_1},x_{b_2},...$. I also have a vector, $y$, which encodes the target nutrition values. I want to select a combination of foods from the food sets such that the difference between the target nutrition $y$ and the sum of the selected food nutrition vectors is minimized:

$$min_x\text{ }||x-y||_2$$

where $x$ is the element-wise sum of all selected foods $$x = \sum_i{x_{i_j}}$$ for $i \in \{A,B,C,...\}$ and $j \in \{1,2,3,...\}$. The simplest, most computationally expensive way to do this is to calculate the Euclidean distance for every possible combination of food items and take the minimum, but this intractable for large sets $A,B,...$

I was considering using linear programming but I was unable to come up with any way to formulate this coherently. What I have tried is the following:

Let $Z$ be a $m \times n$ matrix for $n$ foods and $m$ nutrients. Then we can write

\begin{equation} \label{eq1} \begin{split} min \quad & \text{ }||(diag(Zx)Z^T)\textbf{1} - y||_2 \\ s.t. \quad & \\ & x \geq 0 \end{split} \end{equation}

where for a given $x$, $Zx$ gives the amount of each food, multiplied element-wise against $Z^T$ and a column vector of 1's to get the nutritional vector for the selected foods in $x$ amount that can be compared with the target $y$. Currently I have $x \geq 0$ but I also want to add a constraint that forces at least one selection from a given food set.

Am I on the right track? This seems needlessly complicated. Even if it is correct, how would I go about solving this practically? Is there any way to do this with branch and bound? Any help here would be appreciated

1

There are 1 best solutions below

2
On BEST ANSWER

Here are two linear approaches that consider the $L_1$ and $L_\infty$ norms instead of $L_2$. Let binary decision variable $w_j$ indicate whether food $j$ is selected. If I understand your notation correctly, the amount of nutrient $i$ selected is then $\sum_j Z_{ij} w_j$.

You can minimize the $L_1$ norm $$\sum_i \left|\sum_j Z_{ij} w_j - y_i\right|$$ linearly by introducing nonnegative surplus $s_i$ and slack variables $t_i$ and minimizing $$\sum_i (s_i + t_i)$$ subject to linear constraints $$\sum_j Z_{ij} w_j - s_i + t_i = y_i \quad \text{for all $i$}.$$

Alternatively, you can minimize the $L_\infty$ norm $$\max_i \left|\sum_j Z_{ij} w_j - y_i\right|$$ linearly by introducing nonnegative surplus $s_i$ and slack variables $t_i$ and minimizing $u$ subject to linear constraints \begin{align} \sum_j Z_{ij} w_j - s_i + t_i &= y_i && \text{for all $i$} \\ s_i + t_i &\le u && \text{for all $i$} \end{align}