How to work out the formula that connects several numbers


I have an interesting problem. Say I have lots of datasets like this:

a = 21
b = 23
c = 58
d = 498
etc (lots of other values)

X = 85

I need to find the formula that derives X from a, b, c, d etc, with the added complication that I don't know whether all of the values affect X or whether some have no effect on it. Is there a generic method to do that?

I do not have the ability to vary a, b, c and d and check the derived value of X; however, I have a huge number of these datasets (combinations of values and the resulting X) to look at. I have some programming skills, so I am able to analyse all of these datasets using an algorithm, but I have literally no idea what that algorithm should be. Any help would be appreciated.

Note: I am new to this site, and don't know which tags to use, so feel free to retag this.

EDIT: Each dataset contains the same number of values, and the positions are fixed, i.e. 'a' of one dataset corresponds to the 'a' in others.

Best answer

If you think there is a linear relationship between $a, b, c$, etc., and $X$, then you can find the least-squares solution to the system of equations $\mathbf {Ay = X}$. The matrix $\mathbf A$ has one row per dataset, each of the form $[a_i\ b_i\ c_i \ldots]$, and $\mathbf X$ is a column vector containing the corresponding values $x_i$. The vector $\mathbf y$ holds the unknown weights, one per variable, so that $x_i \approx y_1 a_i + y_2 b_i + y_3 c_i + \cdots$.
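
To make the layout concrete, here is how the system might look with four variables. The first row uses the values from the question; the rows with subscripts $2, \ldots, n$ stand for your other datasets, and $y_1, \ldots, y_4$ are just labels I am introducing for the unknown weights.

$$
\underbrace{\begin{bmatrix}
21 & 23 & 58 & 498 \\
a_2 & b_2 & c_2 & d_2 \\
\vdots & \vdots & \vdots & \vdots \\
a_n & b_n & c_n & d_n
\end{bmatrix}}_{\mathbf A}
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}}_{\mathbf y}
=
\underbrace{\begin{bmatrix} 85 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}}_{\mathbf X}
$$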

The system $\mathbf {Ay = X}$ does not necessarily have an exact solution, especially when there are more datasets than unknowns, but you can find the "best fit" by multiplying both sides by the transpose $\mathbf A^t$ and solving the resulting system, i.e., $\mathbf {A}^t\mathbf{Ay} = \mathbf{A}^t\mathbf{X}$.
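
In case it helps to see why multiplying by $\mathbf A^t$ gives the best fit: the least-squares solution minimizes the sum of squared errors, and setting the gradient of that error to zero yields exactly this system. A short sketch, where $E$ is just my name for the squared error:

$$
\begin{aligned}
E(\mathbf y) &= \lVert \mathbf{Ay} - \mathbf X \rVert^2 = (\mathbf{Ay} - \mathbf X)^t(\mathbf{Ay} - \mathbf X), \\
\nabla E(\mathbf y) &= 2\,\mathbf A^t(\mathbf{Ay} - \mathbf X) = \mathbf 0
\quad\Longrightarrow\quad
\mathbf A^t\mathbf{Ay} = \mathbf A^t\mathbf X .
\end{aligned}
$$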

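Thus, provided $\mathbf{A}^t\mathbf{A}$ is invertible (that is, the columns of $\mathbf A$ are linearly independent), the best-fit solution for your weights is $\mathbf{\hat y} = (\mathbf{A}^t\mathbf{A})^{-1}\mathbf{A}^t\mathbf{X}$.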
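
Since you mention you can program, here is a minimal sketch of the above in Python with NumPy. The data is synthetic and the "true" weights are made up purely for illustration (one of them is set to zero to mimic a variable that has no effect); with your real data you would fill `A` and `X` from your datasets instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real datasets: 200 rows, 4 columns (a, b, c, d).
A = rng.uniform(0, 500, size=(200, 4))

# Made-up weights for illustration; the third is 0, so column c has no effect on X.
true_weights = np.array([2.0, -1.0, 0.0, 0.1])
X = A @ true_weights

# Least-squares solution of A y = X. lstsq does not form (A^t A)^{-1} explicitly,
# which is usually the numerically safer route.
y_hat, residuals, rank, singular_values = np.linalg.lstsq(A, X, rcond=None)

# The same weights obtained from the normal equations A^t A y = A^t X.
y_normal = np.linalg.solve(A.T @ A, A.T @ X)

print(np.round(y_hat, 6))
```

In this noise-free toy case the recovered weight for the third column comes out as essentially zero, which hints at how you might spot variables that do not matter, though with real data the size of a weight also depends on the scale of its variable. If you suspect a constant offset in the relationship, one common approach is to append a column of ones to `A`.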