Linear regression not using mean squared error to estimate parameters

I am looking through an implementation of linear regression, and I found that it does not calculate the parameters directly using a formula like the one below,

[image: formula for the slope and intercept, not preserved]

but instead calculates the Pearson product-moment correlation coefficient and then estimates the parameters from it.

Here is an example. I think the result is the same, but I am confused: why not use the formula I posted above to calculate the slope and intercept directly? What is the benefit of calculating Pearson's coefficient first?

http://code.activestate.com/recipes/578914-simple-linear-regression-with-pure-python/

import math


def fit(X, Y):

    def mean(Xs):
        return sum(Xs) / len(Xs)
    m_X = mean(X)
    m_Y = mean(Y)

    def std(Xs, m):
        normalizer = len(Xs) - 1
        return math.sqrt(sum((pow(x - m, 2) for x in Xs)) / normalizer)
    # assert np.round(Series(X).std(), 6) == np.round(std(X, m_X), 6)

    def pearson_r(Xs, Ys):

        sum_xy = 0
        sum_sq_v_x = 0
        sum_sq_v_y = 0

        for (x, y) in zip(Xs, Ys):
            var_x = x - m_X
            var_y = y - m_Y
            sum_xy += var_x * var_y
            sum_sq_v_x += pow(var_x, 2)
            sum_sq_v_y += pow(var_y, 2)
        return sum_xy / math.sqrt(sum_sq_v_x * sum_sq_v_y)
    # assert np.round(Series(X).corr(Series(Y)), 6) == np.round(pearson_r(X, Y), 6)

    r = pearson_r(X, Y)

    b = r * (std(Y, m_Y) / std(X, m_X))
    A = m_Y - b * m_X

    def line(x):
        return b * x + A
    return line
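
For comparison, here is a minimal sketch (not part of the linked recipe) that computes the slope and intercept both ways. It assumes the "direct" formula is the usual least-squares one, slope = S_xy / S_xx and intercept = mean(Y) - slope * mean(X); the function names `fit_direct` and `fit_pearson` and the sample data are made up for illustration. Because r = S_xy / sqrt(S_xx * S_yy) and std(Y) / std(X) = sqrt(S_yy / S_xx), the two routes should agree up to floating-point rounding.

```python
import math


def centered_sums(X, Y):
    """Return the means and the centered sums S_xy, S_xx, S_yy."""
    n = len(X)
    m_X = sum(X) / n
    m_Y = sum(Y) / n
    s_xy = sum((x - m_X) * (y - m_Y) for x, y in zip(X, Y))
    s_xx = sum((x - m_X) ** 2 for x in X)
    s_yy = sum((y - m_Y) ** 2 for y in Y)
    return m_X, m_Y, s_xy, s_xx, s_yy


def fit_direct(X, Y):
    """Slope and intercept from the closed-form least-squares formulas."""
    m_X, m_Y, s_xy, s_xx, _ = centered_sums(X, Y)
    b = s_xy / s_xx
    return b, m_Y - b * m_X


def fit_pearson(X, Y):
    """Slope and intercept via r * (std_Y / std_X), as in the recipe."""
    n = len(X)
    m_X, m_Y, s_xy, s_xx, s_yy = centered_sums(X, Y)
    r = s_xy / math.sqrt(s_xx * s_yy)
    std_X = math.sqrt(s_xx / (n - 1))  # sample standard deviation of X
    std_Y = math.sqrt(s_yy / (n - 1))  # sample standard deviation of Y
    b = r * std_Y / std_X
    return b, m_Y - b * m_X


# Made-up sample data, roughly on the line y = 2x.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

b_direct, a_direct = fit_direct(X, Y)
b_pearson, a_pearson = fit_pearson(X, Y)
print("direct: ", b_direct, a_direct)
print("pearson:", b_pearson, a_pearson)
```

The Pearson route does a little extra work (two square roots and a division that cancel out), so its main appeal is that r is often wanted anyway as a measure of fit.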

Best answer

What is the benefit of computing the projection matrix $H=X(X'X)^{-1}X'$ and then taking $HY=\hat{Y}$, instead of calculating the intercept and the slope from the OLS formulas?
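
To make the projection-matrix route concrete, here is a rough pure-Python sketch for the simple model, where the design matrix has rows $[1, x_i]$ and $X'X$ is only 2x2. The helper name `hat_matrix` and the sample data are illustrative assumptions. Applying $H$ to the observed responses gives the fitted values, which should match $\hat{\beta}_0 + \hat{\beta}_1 x_i$ from the closed-form estimates.

```python
def hat_matrix(xs):
    """H = X (X'X)^{-1} X' for the design matrix X with rows [1, x_i]."""
    n = len(xs)
    # Entries of the 2x2 matrix X'X = [[n, sx], [sx, sxx]].
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx
    # Inverse of X'X: (1 / det) * [[sxx, -sx], [-sx, n]].
    i00, i01, i11 = sxx / det, -sx / det, n / det
    # H[i][j] = [1, x_i] (X'X)^{-1} [1, x_j]'
    return [[i00 + i01 * xj + xi * (i01 + i11 * xj) for xj in xs]
            for xi in xs]


# Made-up sample data, roughly on the line y = 2x.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

H = hat_matrix(X)
# Fitted values from the projection: Y_hat = H Y.
fitted_H = [sum(h * y for h, y in zip(row, Y)) for row in H]

# Fitted values from the closed-form slope and intercept.
m_X = sum(X) / len(X)
m_Y = sum(Y) / len(Y)
b = (sum((x - m_X) * (y - m_Y) for x, y in zip(X, Y))
     / sum((x - m_X) ** 2 for x in X))
a = m_Y - b * m_X
fitted_ols = [a + b * x for x in X]

print(fitted_H)
print(fitted_ols)
```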

In the simple linear model $y=\beta_0 + \beta_1x+\epsilon$, where $\epsilon \sim \mathcal{N}(0, \sigma^2)$, it doesn't matter whether you compute the estimators using OLS, the Pearson coefficient, MLE, or the projection matrix: they all give the same result. However, in multiple regression models you cannot use the Pearson correlation coefficient, because you have more than two variables to correlate. If the noise term is not normal, then the Pearson coefficient (and OLS) may be inappropriate. So, using $\hat{\beta}_1=r\frac{\sigma_{Y}}{\sigma_{X}}$, with $\sigma_X$ and $\sigma_Y$ the sample standard deviations, is appropriate and equivalent to the OLS result for the simple model under the aforementioned assumptions, which is only a special case of a linear parametric model.
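
The equivalence for the simple model can be checked in a few lines. The shorthand $S_{xy}=\sum_i (x_i-\bar{x})(y_i-\bar{y})$, $S_{xx}=\sum_i (x_i-\bar{x})^2$ and $S_{yy}=\sum_i (y_i-\bar{y})^2$ is notation introduced here for the sketch, with $\sigma_X$ and $\sigma_Y$ taken to be the sample standard deviations (normalizer $n-1$):

```latex
\begin{aligned}
r &= \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}},
\qquad
\frac{\sigma_Y}{\sigma_X}
  = \sqrt{\frac{S_{yy}/(n-1)}{S_{xx}/(n-1)}}
  = \sqrt{\frac{S_{yy}}{S_{xx}}}, \\[4pt]
\hat{\beta}_1 &= r\,\frac{\sigma_Y}{\sigma_X}
  = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}\,\sqrt{\frac{S_{yy}}{S_{xx}}}
  = \frac{S_{xy}}{S_{xx}}, \\[4pt]
\hat{\beta}_0 &= \bar{y} - \hat{\beta}_1\,\bar{x}.
\end{aligned}
```

The last expression for $\hat{\beta}_1$ is the usual OLS slope, so the Pearson route only rearranges the same quantities.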