I have a set with 10.000 equations like this below:
1*A1 + 1*A2 + 0*A3 + 0*A4 + 1*A5... 1*A800 = 1
0*A1 + 1*A2 + 1*A3 + 0*A4 + 0*A5... 0*A800 = 0
0*A1 + 0*A2 + 0*A3 + 1*A4 + 1*A5... 0*A800 = 1
0*A1 + 1*A2 + 1*A3 + 0*A4 + 1*A5... 1*A800 = 0
I need to find a solution to all A1, A2...A800 constants BUT there will be never an exact solution because of how the data above was colected (they are like probabilities, also the number of equations will always be much larger than the number of constants, so the chance of an exact solution is pretty low), I need an aproximated solution which should be optimal in such a way that if I replace the solution of all constants back to the equation above, the error (expected result - actual result) is the minimum.
So how is the logic behind this solution? After knowing the logic, I will program it in C++ to solve the equations or, if there is a better way of doing this in Excel, that would be nice too. I am a Mechanical Engineer and Programmer, and even having some knowledge on solving liner equations, finding optiminal solutions and such, I cant find a way out of this problem!
NOTE: all A1, A2... A800 must be always positive!
The usual way to solve overdetermined system is by least-squares. If your system is
$$Ax=b,$$ you form the so-called normal equations,
$$A^TAx=A^Tb$$ which form a square system (in your case $800\times800$, dense). You will obtain a solution that is optimal in the least-squares sense (minimum sum of the squared residues).
Anyway, due to the special form of the coefficients, maybe some other approach is better.
Update:
The last minute note (positiveness constraint) makes it a quite different and much harder problem. https://en.wikipedia.org/wiki/Non-negative_least_squares