I'm a computer programmer trying to solve a particular toy problem, and my understanding of linear algebra is far too lacking to solve it!
I have a data set that can be modeled using this function:
f1(w,x,y,z)=(Aw+Bx+C)(Dy+Ez+F)
(For some constants A, B, C, D, E, and F.)
I expanded the function as follows:
f2(w,x,y,z)=ADwy+AEwz+AFw+BDxy+BExz+BFx+CDy+CEz+CF
And then rewrote this function as the equivalent:
f3(a,b,c,d,e,f,g,h)=ADa+AEb+AFc+BDd+BEe+BFf+CDg+CEh+CF
(This is with a=wy, b=wz, etc.)
And then, since that's a linear function, found a best fit for the constants AD, AE, AF, BD, BE, BF, CD, CE, and CF using linear least-squares regression. (To do this, I simply transformed my input data columns as needed to go from the form of f1 to the form of f3. In case it's relevant, I used Python's numpy.linalg.lstsq, and the fit is well within the tolerance levels I'm interested in.)
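For concreteness, the column transformation and fit described above can be sketched like this (the data here is hypothetical, generated from known constants so the recovered coefficients can be checked):

```python
import numpy as np

# Hypothetical example data: observation columns for w, x, y, z, and
# outputs t = f1(w, x, y, z) generated from known constants A..F.
rng = np.random.default_rng(0)
w, x, y, z = rng.normal(size=(4, 200))
A, B, C, D, E, F = 2.0, -1.0, 0.5, 3.0, 1.5, -2.0
t = (A * w + B * x + C) * (D * y + E * z + F)

# Design matrix for f3: columns wy, wz, w, xy, xz, x, y, z, 1,
# whose fitted coefficients are AD, AE, AF, BD, BE, BF, CD, CE, CF.
M = np.column_stack([w * y, w * z, w, x * y, x * z, x, y, z, np.ones_like(w)])
coeffs, residuals, rank, sv = np.linalg.lstsq(M, t, rcond=None)
```

With noiseless data like this, coeffs matches the nine products (AD, AE, ..., CF) to numerical precision; with real data it is the least-squares best fit.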
Given that, is it possible to factor the result back out in order to find the constants A, B, C, D, E, and F? If so, how do I do so? If not, why?
Thanks!
EDIT: provided additional explanation to show how I converted the problem to a linear one. As I said, the first step of the problem already works; I'm trying to move to the second step!
Many thanks to everyone (and in particular, Respawned Fluff) for the hints. I'll answer my own question in full for posterity.
It seems what I'm doing is very similar to a polynomial regression. The technique violates some of the assumptions of a linear regression (particularly independence of the explanatory variables), but it nonetheless works in many situations. (One should still verify the results, just to be safe, since the high correlation between variables can lead to degeneracy. In my case, I did so by graphing the results and visually checking that they were sane.) The downside, unfortunately, is that the resulting constants are difficult to interpret, and the resulting best-fit equation might be better treated as a "black box."
More to the point of my question, it is not possible in general to factor the result back out exactly: the nine fitted coefficients, arranged as a 3x3 matrix, would have to equal the outer product of (A, B, C) and (D, E, F), which is a rank-1 matrix, and a least-squares fit will not in general land on a rank-1 combination. It is, however, possible to use singular value decomposition (in NumPy, numpy.linalg.svd) to "approximately factor" the polynomial, though doing so will most likely worsen the fit. The preferred approach, if I want to fit an equation in a particular form, is to simply use a nonlinear regression method. (NumPy doesn't include the tools for this, but SciPy does.) My intuition suggests that nonlinear methods are more work than the linear method I used, since one needs to make initial guesses, but they provide a much easier framework for intuitively understanding the data, since the result will be in whatever form you want.
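The SVD-based "approximate factoring" can be sketched as follows. The coefficient values here are hypothetical (chosen to be exactly factorable so the round trip is easy to see); with real fitted coefficients the rank-1 approximation is only as good as the smaller singular values are close to zero:

```python
import numpy as np

# Hypothetical fitted coefficients, ordered AD, AE, AF, BD, BE, BF, CD, CE, CF.
coeffs = np.array([6.0, 3.0, -4.0, -3.0, -1.5, 2.0, 1.5, 0.75, -1.0])

# As a 3x3 matrix, an exactly factorable fit is the rank-1 outer
# product of (A, B, C) and (D, E, F).
M = coeffs.reshape(3, 3)

# The SVD's leading singular triple gives the closest rank-1 matrix in
# the least-squares sense; split the singular value between the factors.
U, s, Vt = np.linalg.svd(M)
abc_vec = U[:, 0] * np.sqrt(s[0])   # candidate (A, B, C)
def_vec = Vt[0] * np.sqrt(s[0])     # candidate (D, E, F)

# np.outer(abc_vec, def_vec) approximates M; the smaller s[1] and s[2]
# are, the closer the fitted polynomial is to a true product form.
```

Note that the factorization is only defined up to scale and sign: scaling (A, B, C) by k and (D, E, F) by 1/k gives the same product, so the recovered vectors are one representative of that family.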