correlation coefficient in linear regression unexplainable


I wanted to compare two datasets and show their regression graphs and correlations. I calculated everything and wanted to show the data in Excel, but got confused: the dataset whose regression graph looks better to my eyes (and which I therefore expected to have the better correlation) has a much lower correlation than the other dataset.

Here are the graphs, the correlations, and the calculated coefficients:

correlation coefficient of two datasets

I expected such low correlations only with datasets like the following:

low correlations

I've googled around but didn't find any explanation for this. Can any of you explain it?

If you want, I can also provide the datasets used here. Thank you very much in advance.


There is 1 best solution below

Best answer

The way to make sense of this is to look at the correlation formula. A plot can look great to the eye, but if, for example, your units are very small, the distances between the points and the line can also look very small even though the variables are barely correlated. Write out the correlation formula you used (you didn’t say which one) and compute each of its elements to really understand the result. You may then see, for example, that one variable has a high variance and that this is what affects the correlation. Your sample size also looks small, so beware that the results may not be statistically significant.
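To make "compute each of its elements" concrete, here is a minimal sketch in Python. It assumes you mean the Pearson correlation (as far as I know, that is what Excel's CORREL returns), and the numbers are made up for illustration, not your data. The function name `pearson_parts` and the variable names are my own.

```python
import math

def pearson_parts(xs, ys):
    """Return the pieces of the Pearson correlation: covariance, variances, r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample variances (n - 1 in the denominator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    r = cov / math.sqrt(var_x * var_y)
    return {"cov": cov, "var_x": var_x, "var_y": var_y, "r": r}

# Hypothetical data: same x values, two different y series.
xs = [1, 2, 3, 4, 5, 6]
steep = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]      # clear trend, roughly y = 2x
flat = [5.02, 4.97, 5.03, 4.99, 5.04, 4.98]   # almost no trend, tiny scatter

parts_steep = pearson_parts(xs, steep)
parts_flat = pearson_parts(xs, flat)

# Changing the units of y (here: multiplying by 1000) rescales the covariance
# and the variance, but leaves r unchanged.
parts_flat_rescaled = pearson_parts(xs, [1000 * y for y in flat])

for name, parts in [("steep", parts_steep),
                    ("flat", parts_flat),
                    ("flat, rescaled", parts_flat_rescaled)]:
    print(name, parts)
```

On a chart, the points of the "flat" series sit very close to their fitted line because the scatter is tiny in absolute terms, yet r is close to zero: x explains almost none of the (small) variation in y. Rescaling y changes how the chart looks but not r, so the visual distance of the points from the line is not a reliable guide to the correlation.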