Finding best fitting universal weights for several weighted sums


I have several weighted sum equations, all of the following form:

  • $Sum1 = 0.6x + 0.4y$
  • $Sum2 = 0.5x + 0.5y$
  • $Sum3 = 0.2x + 0.8y$
  • $Sum4 = 0.7x + 0.3y$
  • $...$

I am searching for the best fitting pair of $x, y$ values, as no $x, y$ pair can solve all equations at once. The other values (the weights and the sums) are known. It is also known that $x$ and $y$ are (should be) within the range $[0, 1]$.

So far I have tried some naive iterative approaches: starting at an arbitrary pair, such as $(0, 0)$, and then altering the values in each step in a direction which reduces the total error, trying to close in on the optimal value. The problem is that I always seem to get stuck in a local optimum, even when an exact solution to the problem exists (on generated test data).

I am sure that the best fit values can be found reliably, but I don't know how to handle this problem best. Searching for best fits yielded lots of information for fitting various curve types to a set of data points, but that doesn't seem to trivially translate to this case, or at least I don't see how.

On BEST ANSWER

This looks very much like a linear regression problem in two variables with no intercept $$z=x \,a + y\, b$$ for a set of $n$ data points $(a_i,b_i,z_i)$ where $(x,y)$ are the coefficients to be determined.

Let us use just basic calculus and write each equation as $$s_i=a_i\, x+ b_i \,y$$ and consider the norm $$\Phi=\frac 12\sum_{i=1}^n(a_i\, x+ b_i \,y-s_i)^2$$ and compute the derivatives with respect to $x$ and $y$ $$\frac{\partial \Phi}{\partial x}=\sum_{i=1}^n a_i (a_i\, x+b_i\, y-s_i)$$ $$\frac{\partial \Phi}{\partial y}=\sum_{i=1}^n b_i (a_i\, x+b_i\, y-s_i)$$ Set them equal to $0$ and expand to get $$x \sum_{i=1}^n a_i^2+y\sum_{i=1}^n a_i b_i=\sum_{i=1}^n a_i s_i$$ $$x \sum_{i=1}^n a_ib_i+y\sum_{i=1}^n b_i^2=\sum_{i=1}^n b_i s_i$$ So, two linear equations to be solved for $(x,y)$.
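
As a minimal sketch of the two normal equations above, the $2\times 2$ system can be solved directly with Cramer's rule. The function name `fit_weights` and its argument layout are hypothetical choices made for illustration; the formulas are the ones just derived.

```python
def fit_weights(a, b, s):
    """Least-squares (x, y) for s_i ~ a_i*x + b_i*y, using the 2x2 normal equations.

    a, b, s are equal-length sequences of numbers (hypothetical interface).
    """
    # The five sums that appear in the normal equations.
    saa = sum(ai * ai for ai in a)
    sab = sum(ai * bi for ai, bi in zip(a, b))
    sbb = sum(bi * bi for bi in b)
    sas = sum(ai * si for ai, si in zip(a, s))
    sbs = sum(bi * si for bi, si in zip(b, s))

    # Determinant of [[saa, sab], [sab, sbb]]; it vanishes when all (a_i, b_i)
    # are proportional, in which case x and y cannot be separated.
    # With floating-point input an exact zero test is only a rough guard.
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("the equations do not determine x and y uniquely")

    # Cramer's rule for the two linear equations.
    x = (sas * sbb - sab * sbs) / det
    y = (saa * sbs - sab * sas) / det
    return x, y
```

Because $\Phi$ is a convex quadratic in $(x,y)$, this stationary point is the global minimum, so there is no local optimum to get stuck in. The sketch ignores the range $[0,1]$ mentioned in the question; if the unconstrained solution falls outside it, a bounded solver such as `scipy.optimize.lsq_linear` with its `bounds` argument would be one option.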

In all your equations, $a_i+b_i=1$, but this does not change the problem except that the calculations are simpler since $$\sum_{i=1}^n a_i b_i=\sum_{i=1}^n a_i -\sum_{i=1}^n a_i^2 \qquad \sum_{i=1}^n b_i^2=n-2\sum_{i=1}^n a_i+\sum_{i=1}^n a_i^2 \qquad \sum_{i=1}^n b_i s_i=\sum_{i=1}^n s_i-\sum_{i=1}^n a_i s_i$$

For illustration purposes, let us use the few cases you posted $$\left( \begin{array}{cccc} i & a_i & b_i & s_i \\ 1 & 0.6 & 0.4 & 3.0 \\ 2 & 0.5 & 0.5 & 2.9 \\ 3 & 0.2 & 0.8 & 2.8 \\ 4 & 0.7 & 0.3 & 3.0 \end{array} \right)$$ and let us work with exact fractions. This leads to

$$\frac{57}{50}x+\frac{43}{50}y=\frac{591}{100}$$ $$\frac{43}{50}x+\frac{57}{50}y=\frac{579}{100}$$ the solutions of which are $$x=\frac{879}{280}\approx 3.13929 \qquad \text{and} \qquad y=\frac{759}{280}\approx 2.71071$$ and, to tell the truth, the numbers were generated using $$s_i=\frac{\text{Round}[10 (\pi\, a_i+e \,b_i)]}{10} $$
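
The worked example can be checked with exact rational arithmetic. The sketch below assumes Python's standard `fractions` module and the four data rows from the table; the variable names are illustrative only.

```python
import math
from fractions import Fraction as F

# The four rows of the table, kept as exact fractions.
a = [F(6, 10), F(5, 10), F(2, 10), F(7, 10)]
b = [1 - ai for ai in a]                       # a_i + b_i = 1 in every row
s = [F(30, 10), F(29, 10), F(28, 10), F(30, 10)]

# Sums entering the two normal equations.
saa = sum(ai * ai for ai in a)                 # 57/50
sab = sum(ai * bi for ai, bi in zip(a, b))     # 43/50
sbb = sum(bi * bi for bi in b)                 # 57/50
sas = sum(ai * si for ai, si in zip(a, s))     # 591/100
sbs = sum(bi * si for bi, si in zip(b, s))     # 579/100

# Solve the 2x2 system by Cramer's rule.
det = saa * sbb - sab * sab
x = (sas * sbb - sab * sbs) / det
y = (saa * sbs - sab * sas) / det
print(x, y)                                    # 879/280 759/280

# The s_i values are pi*a_i + e*b_i rounded to one decimal place.
regenerated = [F(round(10 * (math.pi * float(ai) + math.e * float(bi))), 10)
               for ai, bi in zip(a, b)]
print(regenerated == s)                        # True
```

The fitted values $3.13929$ and $2.71071$ are close to, but not exactly, $\pi$ and $e$, which is what one would expect given that the $s_i$ were rounded to one decimal place.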