What is the best way to determine the relationship among three apparently related variables? The relationship does not appear to be linear, and may follow a combination of non-linear functions.
I have the following data points:
x y z
1 0.5 0.01
1 1 0.01
1 2 0.01
1 10 0.01
1.3 0.5 0.015
1.3 1 0.0177
1.3 2 0.023
1.3 10 0.066
1.5 0.5 0.018
1.5 1 0.0223
1.5 2 0.031
1.5 10 0.1
Assume z is the output and x and y are the inputs, and that no variable can be 0.
- Given these sample data points, how can I predict z given x and y?
- Is there a mathematical relationship between the variables?
- How can I find an equation that relates these variables?


I began by making a matrix of scatterplots of each variable against the other two. There is some association between z and each of the predictor variables, but apparently not between x and y. Correlations are as shown below.
The patterns of association in the scatterplots suggested it might be better to use $\log_e z$ than $z.$ Correlations are as follows. (The second number in each cell is a P-value, assuming normal data.)
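For readers without Minitab, the correlations can be recomputed with a short NumPy sketch like the one below. The variable names (`data`, `corr_raw`, `corr_log`) are my own, and the sketch reports only the Pearson correlations, not the P-values.

```python
import numpy as np

# Data from the question: columns are x, y, z.
data = np.array([
    [1.0, 0.5, 0.01], [1.0, 1, 0.01], [1.0, 2, 0.01], [1.0, 10, 0.01],
    [1.3, 0.5, 0.015], [1.3, 1, 0.0177], [1.3, 2, 0.023], [1.3, 10, 0.066],
    [1.5, 0.5, 0.018], [1.5, 1, 0.0223], [1.5, 2, 0.031], [1.5, 10, 0.1],
])
x, y, z = data.T
log_z = np.log(z)  # natural log, i.e. log_e z

# Each row passed to corrcoef is treated as one variable.
corr_raw = np.corrcoef(np.vstack([x, y, z]))      # x, y, z
corr_log = np.corrcoef(np.vstack([x, y, log_z]))  # x, y, log z

print("Correlations with z:\n", np.round(corr_raw, 3))
print("Correlations with log z:\n", np.round(corr_log, 3))
```

Because every x value is paired with the same four y values, the x-y correlation comes out as zero up to rounding error, which agrees with the scatterplot impression.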
Based on this information I did a regression of log z on x and y.
[Minitab output: General Regression Analysis: logz versus x, y]
Based on this printout it seems that $\log(z)$ (natural log) can be predicted by the equation $\log(z) = -7.37403 + 2.46466 x + 0.105768 y.$ The P-values indicate that the constant term $-7.37403$ and the coefficients of $x$ and $y$ are all significantly different from 0 at the 1% level of significance. The diagnostics indicate that the fourth row of your data fits the regression model poorly (but I would say not so poorly that this row of data should be ignored).
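The fitted equation can be checked outside Minitab with ordinary least squares. This is a minimal NumPy sketch; `predict_z` is a hypothetical helper name of mine, and exponentiating the fitted value is a simple back-transformation that ignores any retransformation bias correction.

```python
import numpy as np

# Data from the question: columns are x, y, z.
data = np.array([
    [1.0, 0.5, 0.01], [1.0, 1, 0.01], [1.0, 2, 0.01], [1.0, 10, 0.01],
    [1.3, 0.5, 0.015], [1.3, 1, 0.0177], [1.3, 2, 0.023], [1.3, 10, 0.066],
    [1.5, 0.5, 0.018], [1.5, 1, 0.0223], [1.5, 2, 0.031], [1.5, 10, 0.1],
])
x, y, z = data.T

# Design matrix with an intercept column, then least squares on log z.
X = np.column_stack([np.ones_like(x), x, y])
coef, *_ = np.linalg.lstsq(X, np.log(z), rcond=None)
b0, b1, b2 = coef
print(f"log(z) = {b0:.5f} + {b1:.5f} x + {b2:.6f} y")

def predict_z(x_new, y_new):
    """Hypothetical helper: predict z by exponentiating the fitted log z."""
    return np.exp(b0 + b1 * x_new + b2 * y_new)

print(predict_z(1.3, 1.0))  # prediction for an (x, y) pair in the data range
```

Predictions are only as trustworthy as the fit itself, and should not be extended far outside the range of x and y in the sample.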
I also did a regression of z on x and y, obtaining the regression equation $z = -0.0714 + 0.0658421 x + 0.00466667 y$. However, all indications are that this is not quite as successful a fit as the one shown above.
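One rough way to compare the two fits is to compute $R^2$ for each. The sketch below does this with NumPy; the helper name `fit_r2` is my own, and since the two $R^2$ values are measured on different response scales (z versus log z), the comparison is indicative rather than a formal test.

```python
import numpy as np

# Data from the question: columns are x, y, z.
data = np.array([
    [1.0, 0.5, 0.01], [1.0, 1, 0.01], [1.0, 2, 0.01], [1.0, 10, 0.01],
    [1.3, 0.5, 0.015], [1.3, 1, 0.0177], [1.3, 2, 0.023], [1.3, 10, 0.066],
    [1.5, 0.5, 0.018], [1.5, 1, 0.0223], [1.5, 2, 0.031], [1.5, 10, 0.1],
])
x, y, z = data.T
X = np.column_stack([np.ones_like(x), x, y])

def fit_r2(X, response):
    """Hypothetical helper: least-squares coefficients and R^2 for one response."""
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    resid = response - X @ coef
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((response - response.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

coef_lin, r2_lin = fit_r2(X, z)          # z on x and y
coef_log, r2_log = fit_r2(X, np.log(z))  # log z on x and y

print("z model:    ", np.round(coef_lin, 6), "R^2 =", round(r2_lin, 3))
print("log z model:", np.round(coef_log, 6), "R^2 =", round(r2_log, 3))
```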
The scatterplots of variables x, y, and log z are shown below. Associations are possibly useful for prediction, but (even with the log transformation of z) they are only roughly linear.
So there is a useful relationship among the three variables making it possible to predict log z reasonably well from x and y. It is possible that some transformation of z other than taking logs would be better. It is also possible, especially with better knowledge of how the data were collected, that more useful predictions could be obtained by eliminating the fourth row of data from the regression. I will leave it to you to explore such variations.
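As a starting point for that exploration, here is a sketch that refits the log model with the fourth row removed and prints both sets of coefficients. The helper `fit_log_model` is a name I made up, and whether dropping the row is justified depends on how the data were collected; the code only shows the mechanics.

```python
import numpy as np

# Data from the question: columns are x, y, z.
data = np.array([
    [1.0, 0.5, 0.01], [1.0, 1, 0.01], [1.0, 2, 0.01], [1.0, 10, 0.01],
    [1.3, 0.5, 0.015], [1.3, 1, 0.0177], [1.3, 2, 0.023], [1.3, 10, 0.066],
    [1.5, 0.5, 0.018], [1.5, 1, 0.0223], [1.5, 2, 0.031], [1.5, 10, 0.1],
])
x, y, z = data.T
log_z = np.log(z)

def fit_log_model(x, y, log_z):
    """Hypothetical helper: least-squares fit of log z on x and y, with residuals."""
    X = np.column_stack([np.ones_like(x), x, y])
    coef, *_ = np.linalg.lstsq(X, log_z, rcond=None)
    return coef, log_z - X @ coef

# Full fit: find the row with the largest absolute residual (0-based index).
coef_all, resid_all = fit_log_model(x, y, log_z)
worst = int(np.argmax(np.abs(resid_all)))
print("Worst-fitting row (1-based):", worst + 1)

# Refit without the fourth row (index 3) and compare coefficients.
keep = np.arange(len(x)) != 3
coef_drop, resid_drop = fit_log_model(x[keep], y[keep], log_z[keep])
print("All rows:     ", np.round(coef_all, 5))
print("Without row 4:", np.round(coef_drop, 5))
```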
If you are not familiar with linear regression using two predictor variables, you should read in your textbook about the methods and the cautions in interpretation. Computations and the graph are from Minitab statistical software, but other software should give essentially the same results.