Help with multiple linear regression

I have a question about regression analysis in JMP (or any other tool).

I have one dependent variable $y$ and $2$ independent variables $x_1$ and $x_2$.

For example:

$y$ = time per row (total time divided by total rows, $x_2$).

$x_1$ = number of new rows added to the db.

$x_2$ = total rows in the db.

$t$ = total time for a database query.

So the observed per-row time is $y = t / x_2$.

As you can see, the total number of database rows ($x_2$) increases as new rows are added ($x_1$), so the per-row time ($y$) decreases as more rows are added to the db.

Sample data:

$$
\begin{matrix}
y & x_1 & x_2 \\
0.000465116 & 0 & 86 \\
0.000659091 & 1 & 44 \\
0.000597561 & 2 & 82 \\
0.000635294 & 2 & 85 \\
0.00053271 & 2 & 107 \\
0.000590909 & 2 & 110 \\
0.0005 & 2 & 244 \\
0.000577075 & 2 & 253 \\
0.000685714 & 3 & 35 \\
0.000947368 & 3 & 38 \\
0.000717949 & 3 & 39 \\
0.000755556 & 3 & 45 \\
0.000574468 & 3 & 47 \\
0.000716981 & 3 & 53
\end{matrix}
$$

Can anyone suggest how I should approach this?

I get $R^2 = 50\%$ when I model $y$ against $x_1$ only, which I take to mean that $50\%$ of my data is correct under this linear fit.

Should I be modeling the per-row time ($y$) against the number of rows added? It's a bit confusing, I know. I would be happy to clarify the details if anyone requires more information.

There is 1 best solution below

If you mean multiple linear regression, start with the model $y = a + b x_1 + c x_2$. Come back with your results and we can continue the discussion.
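As a starting point, here is a minimal sketch of that fit in Python with NumPy, using the sample data from the question. The choice of NumPy and ordinary least squares via `np.linalg.lstsq` is an assumption made for illustration (the question mentions JMP), and the variable names are hypothetical.

```python
import numpy as np

# Sample data copied from the question: columns are y, x1, x2.
data = np.array([
    [0.000465116, 0, 86],
    [0.000659091, 1, 44],
    [0.000597561, 2, 82],
    [0.000635294, 2, 85],
    [0.00053271, 2, 107],
    [0.000590909, 2, 110],
    [0.0005, 2, 244],
    [0.000577075, 2, 253],
    [0.000685714, 3, 35],
    [0.000947368, 3, 38],
    [0.000717949, 3, 39],
    [0.000755556, 3, 45],
    [0.000574468, 3, 47],
    [0.000716981, 3, 53],
])
y, x1, x2 = data[:, 0], data[:, 1], data[:, 2]

# Design matrix with an intercept column, so the model is y = a + b*x1 + c*x2.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Ordinary least squares; lstsq returns (coefficients, residuals, rank, singular values).
coef, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
a, b, c = coef

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
fitted = X @ coef
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"a = {a:.6g}, b = {b:.6g}, c = {c:.6g}, R^2 = {r_squared:.3f}")
```

With only 14 observations the estimates will be noisy, so the printed coefficients are best read as a first look rather than a final model.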

By the way, you have a very surprising way of interpreting $R^2$.
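For reference, $R^2$ is usually read as the fraction of the variance of $y$ that the fit explains, not as the fraction of data points that are "correct". A sketch of the standard definition for a least-squares fit with an intercept, where $\hat{y}_i$ (fitted values), $\bar{y}$ (sample mean of $y$) and $n$ (number of observations) are notation introduced here for illustration:

$$ R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} $$

So $R^2 = 50\%$ for the fit on $x_1$ alone would mean that this fit accounts for about half of the variation in $y$.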