Multiple Regression over an experimental dataset

I want to fit a multiple regression to an experimental result, shown as a 3D plot and a heatmap in the following images. Sorry, as a new user I am not allowed to post them directly, so they are just links to imgur!

3D Plot:

3D-Plot

Heatmap:

Heatmap

Just from looking at the images, there is obviously a relationship between the variables and the output.

At the moment I get a very poor coefficient of determination ($R^2$), although I have already tried several combinations of terms.

Let's say $x_1$ is the first variable and $x_2$ is the second. $y$ is the result depending on $x_1$ and $x_2$, and $x_1x_2$ is the product of $x_1$ and $x_2$. I also define the reciprocal of each of these as $x_{1_{re}}$, $x_{2_{re}}$ and $\left(x_1 x_2\right)_{re}$ (for example, $x_{1_{re}} = \frac{1}{x_1}$).

The best $R^2$ I can obtain is $0.49$, with the following formula: $y \approx x_{1_{re}} + x_{2_{re}} + \left(x_1 x_2\right)_{re}$

If I use a simple model like $y \approx x_1 + x_2 + x_1x_2$, it gets even worse, dropping to $0.32$.

Can somebody help me out and point me in the right direction?
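For concreteness, the two models above could be fitted by ordinary least squares roughly as in the following sketch. The data are synthetic placeholders (the real measurements are only available as images), the names `fit_r2`, `x1`, `x2` and `y` are purely illustrative, and an intercept term is assumed in both models.

```python
import numpy as np

def fit_r2(columns, y):
    """Fit y ~ intercept + columns by ordinary least squares.

    Returns the coefficient vector and the in-sample R^2.
    """
    X = np.column_stack([np.ones_like(y)] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = np.sum(residuals ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return beta, 1.0 - ss_res / ss_tot

# Synthetic placeholder data (assumption): NOT the experimental values.
rng = np.random.default_rng(0)
x1 = rng.uniform(1.0, 10.0, size=200)
x2 = rng.uniform(1.0, 10.0, size=200)
y = 2.0 / x1 + 3.0 / x2 + rng.normal(scale=0.3, size=200)

# Simple model: y ~ x1 + x2 + x1*x2
_, r2_simple = fit_r2([x1, x2, x1 * x2], y)

# Reciprocal model: y ~ 1/x1 + 1/x2 + 1/(x1*x2)
_, r2_recip = fit_r2([1.0 / x1, 1.0 / x2, 1.0 / (x1 * x2)], y)

print(f"simple model:     R^2 = {r2_simple:.3f}")
print(f"reciprocal model: R^2 = {r2_recip:.3f}")
```

Replacing the placeholder arrays with the measured values of $x_1$, $x_2$ and $y$ gives the $R^2$ of each candidate formula, so further terms can be compared in the same way.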

Is the regression really so bad? Is there perhaps another formula I should try?

There is 1 best solution below

Thanks for your help. This data comes from an experiment using a combination of Artificial Neural Networks and Simulated Annealing. In short:

1. Generate a specified number of data sets from a given function (X,Y,Z)
2. Train a Multi-Layer Perceptron (MLP)
3. Run Simulated Annealing, presenting new data to the MLP, to find an optimum

This data is the combination of:

X = number of data sets

Y = number of hidden neurons of the MLP

Z = difference between the optimum obtained from SA and the real optimum

I just drew a scatterplot to visualize it, and I think you are right: the data oscillate too much for a good $R^2$. But are there any ideas left for finding an approximation?

https://i.stack.imgur.com/sLbxI.png
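To illustrate why strongly oscillating data limit the achievable $R^2$: if the measurements are a smooth trend plus independent noise, even the true trend explains only about var(trend) / (var(trend) + var(noise)) of the variance. The sketch below checks this on synthetic data. The linear trend, the noise level and all names are assumptions for illustration, not the experimental values.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination of a prediction against observations."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 2000)

trend = 2.0 * x          # assumed smooth dependence (hypothetical)
noise_sd = 1.0           # assumed noise level (hypothetical)
z = trend + rng.normal(scale=noise_sd, size=x.size)

# R^2 obtained when predicting with the exact trend that generated the data
r2_true_trend = r_squared(z, trend)

# Expected ceiling from the signal and noise variances
ceiling = trend.var() / (trend.var() + noise_sd ** 2)

print(f"R^2 of the true trend: {r2_true_trend:.3f}")
print(f"variance-ratio ceiling: {ceiling:.3f}")
```

In this setting no choice of regression terms can push $R^2$ much above the ceiling without fitting the noise itself, so a low $R^2$ does not by itself mean the formula is wrong.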