How to solve by approximation linear equations

116 Views Asked by At

I have a set with 10.000 equations like this below:

1*A1 + 1*A2 + 0*A3 + 0*A4 + 1*A5... 1*A800 = 1
0*A1 + 1*A2 + 1*A3 + 0*A4 + 0*A5... 0*A800 = 0
0*A1 + 0*A2 + 0*A3 + 1*A4 + 1*A5... 0*A800 = 1
0*A1 + 1*A2 + 1*A3 + 0*A4 + 1*A5... 1*A800 = 0

I need to find a solution to all A1, A2...A800 constants BUT there will be never an exact solution because of how the data above was colected (they are like probabilities, also the number of equations will always be much larger than the number of constants, so the chance of an exact solution is pretty low), I need an aproximated solution which should be optimal in such a way that if I replace the solution of all constants back to the equation above, the error (expected result - actual result) is the minimum.

So how is the logic behind this solution? After knowing the logic, I will program it in C++ to solve the equations or, if there is a better way of doing this in Excel, that would be nice too. I am a Mechanical Engineer and Programmer, and even having some knowledge on solving liner equations, finding optiminal solutions and such, I cant find a way out of this problem!

NOTE: all A1, A2... A800 must be always positive!

2

There are 2 best solutions below

9
On

The usual way to solve overdetermined system is by least-squares. If your system is

$$Ax=b,$$ you form the so-called normal equations,

$$A^TAx=A^Tb$$ which form a square system (in your case $800\times800$, dense). You will obtain a solution that is optimal in the least-squares sense (minimum sum of the squared residues).

Anyway, due to the special form of the coefficients, maybe some other approach is better.


Update:

The last minute note (positiveness constraint) makes it a quite different and much harder problem. https://en.wikipedia.org/wiki/Non-negative_least_squares

0
On

After searching a lot about the subject, I came to the conclusion that regression linear would not solve this specific problem using OLS (at least not computationaly efficient), for example.

The best method I found is using iterative method and choose a method that converges. I found that the The Jacobi Method works pretty well for solving a set of linear equations. And a "plus" side of Jacoby Method is that you can "choose" how long you wanna run the algorithm according to how satisfied you are with the results. If you reach a point where the error of the tried values is getting enoughly small, then you can stop!

source: https://college.cengage.com/mathematics/larson/elementary_linear/5e/students/ch08-10/chap_10_2.pdf

For those who cant program, here is a pretty straighforward video showing how to do that using only excel -> https://www.youtube.com/watch?v=lW0lm4O3Ptw