I have a question about regression analysis in JMP (or any other tool).
I have one dependent variable $y$ and two independent variables, $x_1$ and $x_2$.
For example:

- $x_1 =$ number of new rows added to the database
- $x_2 =$ total number of rows in the database
- $t =$ total time for a database query
- $y =$ observed per-row time $= t / x_2$ (total time divided by total rows)
As you can see, the database row count ($x_2$) increases as new rows ($x_1$) are added, so the per-row time ($y$) decreases when more rows are added to the db.
sample data : $$ \begin{matrix} y & x_1 & x_2 \\ 0.000465116 & 0 & 86 \\ 0.000659091 & 1 & 44 \\ 0.000597561 & 2 & 82 \\ 0.000635294 & 2 & 85 \\ 0.00053271 & 2 & 107 \\ 0.000590909 & 2 & 110 \\ 0.0005 & 2 & 244 \\ 0.000577075 & 2 & 253 \\ 0.000685714 & 3 & 35 \\ 0.000947368 & 3 & 38 \\ 0.000717949 & 3 & 39 \\ 0.000755556 & 3 & 45 \\ 0.000574468 & 3 & 47 \\ 0.000716981 & 3 & 53 \end{matrix} $$
Can anyone suggest how I should approach this?
When I model $y$ against $x_1$ alone I get $R^2 \approx 50\%$, which I take to mean that $50\%$ of my data is correct under this linear fit.
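For concreteness, here is a sketch of that simple fit outside JMP, using Python with numpy (an assumption; the question uses JMP, but least squares gives the same coefficients and $R^2$):

```python
import numpy as np

# Sample data from the question: y = per-row time, x1 = new rows added
y = np.array([0.000465116, 0.000659091, 0.000597561, 0.000635294,
              0.00053271, 0.000590909, 0.0005, 0.000577075,
              0.000685714, 0.000947368, 0.000717949, 0.000755556,
              0.000574468, 0.000716981])
x1 = np.array([0, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3], dtype=float)

# Simple linear model y = a + b*x1, fitted by least squares
X = np.column_stack([np.ones_like(x1), x1])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

# Coefficient of determination R^2 = 1 - SS_res / SS_tot
r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
print(f"a = {coef[0]:.6g}, b = {coef[1]:.6g}, R^2 = {r2:.3f}")
```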
Should I be modeling per-row time ($y$) against the number of rows added ($x_1$)? It's a bit confusing, I know; I would be happy to clarify the details if anyone requires more information.
If you are asking about multiple linear regression, start with $y = a + b\,x_1 + c\,x_2$. Come back with your results and we can continue the discussion.
By the way, you have a very surprising interpretation of $R^2$.
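The suggested two-predictor model can be sketched as follows (Python/numpy is an assumption here; JMP's Fit Model platform fits the same equation). Since $R^2$ can only increase when a regressor is added, this fit should score at least as high as the $x_1$-only fit:

```python
import numpy as np

# Sample data from the question
y = np.array([0.000465116, 0.000659091, 0.000597561, 0.000635294,
              0.00053271, 0.000590909, 0.0005, 0.000577075,
              0.000685714, 0.000947368, 0.000717949, 0.000755556,
              0.000574468, 0.000716981])
x1 = np.array([0, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3], dtype=float)
x2 = np.array([86, 44, 82, 85, 107, 110, 244, 253,
               35, 38, 39, 45, 47, 53], dtype=float)

# Multiple linear model y = a + b*x1 + c*x2, fitted by least squares
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b, c = coef
resid = y - X @ coef
r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
print(f"a = {a:.6g}, b = {b:.6g}, c = {c:.6g}, R^2 = {r2:.3f}")
```

Note that $R^2$ measures the fraction of the variance in $y$ explained by the model, not the fraction of data points that are "correct".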