Minimizing the sum of squared differences between integer variables and given values subject to a sum constraint

64 Views Asked by At

Let $a$ and $p$ be two sets of equal size. $p$ is a set of positive real numbers that sum up to an integer. The objective is to find a set of values $a$ that satisfy the given constraints:

minimize: $$\sum_i (a_{i} - p_{i}) ^ 2$$ subject to:

$$ \sum_i a_i = \sum_i p_i \\ a_{i} \in \mathbb N $$

I am new to the field of optimization so I would appreciate some guidance in approaching this problem.

1

There are 1 best solutions below

3
On

Given,

$$\min f(a)=\sum_i (a_{i} - p_{i}) ^ 2$$ $$ \text{Subject to: }\sum_i a_i = \sum_i p_i \\ a_{i} \in \mathbb N $$

We expand the objective function to become:

$$\min f(a) =\sum_i a_{i}^2 -2\sum_i a_ip_{i}+\sum_i p_i^2$$

The optimal function will reach an optimal solution at $\nabla f(a) = 0$, thus,

$$\nabla f(x) = \begin{pmatrix} 2a_1 - 2p_1 \\ \vdots \\ 2a_n - 2p_n \end{pmatrix}$$

Thus,

$$\begin{matrix} 2a_1 - 2p_1 = 0\\ \vdots \\ 2a_n - 2p_n = 0 \end{matrix} \rightarrow \begin{matrix} 2a_1 = 2p_1\\ \vdots \\ 2a_n = 2p_n \end{matrix} \rightarrow \begin{matrix} a_1 = p_1\\ \vdots \\ a_n = p_n \end{matrix}$$

Thus, the min will occur when $a_i = p_i$ for all $i$. Thus,

$$\min f(a=p)=\sum_i (p_{i} - p_{i}) ^ 2 = 0$$ $$ \text{Subject to: }\sum_i p_i = \sum_i p_i \\ $$

This result is expected, as this is this the sphere function, whose global min is at zero. But this solution is the "relaxed" solution, from here, we'll need to use a branch-and-bound technique to find integer solutions for the problem in respect to our decision variable $a$. For this problem, this would mean adding additional constraints for each individual $a$ until the sum of $a$ equals the sum of $p$.

For example, consider a relaxed solution of only two variables with $a_1 = 3.5$ and $a_2 = 3.5$, and the sum of $p$ is $7$.

From here we can test two types of constraints that $a_1 \le 3$ (the floor of its relaxed solution) or $a_1 \ge 4$ (the ceil of its relaxed solution). We test each added case until we have no more cases and take the highest feasible case.

So for example, the case of $a_1 \le 3$ would make $a_1=3$ and $a_2=4$, and the case of $a_1 \ge 4$ would make $a_1=4$ and $a_2=3$. Then we would calculate the $f(a)$ of each case and take the best one.