What is the truly correct way to denote a regression equation?

I have seen a lot of books, where statistics is not the focus, denote the regression equation in the "proper" way, and I want to clear things up to check my understanding. Is it true that:

  1. The predicted regression equation (i.e. the one that software spits out; the best-guess line of fit) is denoted $\hat{y} = \hat{b}_1 x + \hat{b}_0$.
  2. The actual, as-seen-in-the-collected-data value is $y$, and thus the absolute true regression equation that we would arrive at if we were God or something would be $y = b_1 x + b_0$.
  3. In order to arrive at $y$ using our prediction line, we add some value $\varepsilon$, which is not constant because it depends on which prediction you're adjusting. Generally, to arrive at one of your true data points, $y = \hat{b}_1 x + \hat{b}_0 + \varepsilon$. I'm thinking about all those dots that may be above or below the best fit line: at any point on that best fit line, we need to move vertically up or down to reach our actual number. Sometimes $\varepsilon$ is zero because our best fit line passes right through $y$.
  4. There is no difference between $\varepsilon$ and $\hat{\varepsilon}$ in statement 3. You are not attempting to predict the error (it arises from an attempt to predict $y$ in the first place), so technically $\varepsilon$ is most correct.
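
One way to check these statements numerically is a small simulation. The following is a minimal sketch, not a definitive treatment: it assumes NumPy, a made-up "true" line with $b_1 = 2$ and $b_0 = 1$, and normally distributed noise. All variable names and values are illustrative. It computes the vertical offsets of the data from the fitted line (`resid`) and from the assumed true line (`eps`) so the two can be compared directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" line; known here only because the data are simulated.
b1_true, b0_true = 2.0, 1.0

x = np.linspace(0.0, 10.0, 50)
eps = rng.normal(0.0, 1.0, size=x.size)  # vertical offsets from the true line
y = b1_true * x + b0_true + eps          # the observed data points

# Least-squares fit: the line that software "spits out".
b1_hat, b0_hat = np.polyfit(x, y, deg=1)
y_hat = b1_hat * x + b0_hat

# Vertical offsets of the data from the fitted line.
resid = y - y_hat

print("fitted slope, intercept:", b1_hat, b0_hat)
print("largest |resid - eps|:", np.max(np.abs(resid - eps)))
```

By construction, `y_hat + resid` reproduces every observed `y`, and so does `b1_true * x + b0_true + eps`; printing `resid` next to `eps` shows whether the two sets of offsets coincide for a given data set.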