I have a data set giving the training hours per day and the number of tournaments won for 5 randomly chosen players.
$$ \begin{array}{|c|c|c|c|c|c|} \hline \text{hours} & 1 & 1 & 2 & 3 & 3\\ \hline \text{won} & 0 & 1 & 2 & 2 & 5\\ \hline \end{array} $$
The task is to fit a linear regression model, namely: "State the model and estimate the regression coefficients."
Later on I have to find out how many wins we would expect from a person who trains 5 hours per day.
I tried to solve it the following way.
We have the model $y_i = \beta_0 + \beta_1x_i + \epsilon_i$.
I tried getting the coefficients using the following formulas: $$\beta_1 = \frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2}$$ $$\beta_0 = \bar{y} - \beta_1\bar{x}$$
As $\bar{x} = \bar{y} = 2$ I got $\beta_1 = \frac{3}{2}$ and $\beta_0 = -1$
Hence
$$\hat{y} = \frac{3}{2}x - 1$$
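For reference, the sums behind those values, computed from the table with $\bar{x} = \bar{y} = 2$, are:

$$\sum(x_i-\bar{x})(y_i-\bar{y}) = (-1)(-2) + (-1)(-1) + (0)(0) + (1)(0) + (1)(3) = 6$$ $$\sum(x_i-\bar{x})^2 = 1 + 1 + 0 + 1 + 1 = 4$$

so $\beta_1 = \frac{6}{4} = \frac{3}{2}$ and $\beta_0 = 2 - \frac{3}{2}\cdot 2 = -1$.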
But I suspect this is not the correct approach when the data contain repeated values, or that I forgot something that also needs to be used here.
So should it be done another way?
Repeated points are just fine. Repeating a point is equivalent to giving that point a weight in the fit. In this case, the weight is the number of times the point is repeated.
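As a quick check, here is a minimal Python sketch of the same closed-form formulas, with an optional weight per point. The function name `fit_line` and the second data set (the original with the point $(3, 5)$ added once more) are made up purely for illustration. It assumes integer inputs so that exact fractions can be used.

```python
from collections import Counter
from fractions import Fraction

hours = [1, 1, 2, 3, 3]
won = [0, 1, 2, 2, 5]

def fit_line(xs, ys, ws=None):
    """Least-squares fit of y = b0 + b1*x; ws are optional integer weights."""
    if ws is None:
        ws = [1] * len(xs)          # unweighted case: every point counts once
    total = sum(ws)
    x_bar = Fraction(sum(w * x for w, x in zip(ws, xs)), total)
    y_bar = Fraction(sum(w * y for w, y in zip(ws, ys)), total)
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Fit on the data from the question and predict for 5 hours per day.
b0, b1 = fit_line(hours, won)
pred_5h = b0 + b1 * 5
print(b0, b1, pred_5h)

# Hypothetical data with an exactly repeated point, (3, 5) appearing twice.
xs_rep = hours + [3]
ys_rep = won + [5]

# Collapse repeats into unique points with counts used as weights.
counts = Counter(zip(xs_rep, ys_rep))
xs_u = [x for x, _ in counts]
ys_u = [y for _, y in counts]
ws_u = list(counts.values())

fit_repeated = fit_line(xs_rep, ys_rep)        # repeats listed explicitly
fit_weighted = fit_line(xs_u, ys_u, ws_u)      # repeats expressed as weights
print(fit_repeated == fit_weighted)
```

On the question's data this reproduces $\beta_0 = -1$ and $\beta_1 = \frac{3}{2}$, which gives $\hat{y} = 6.5$ expected wins at 5 hours per day. The last comparison shows that listing a point twice and giving it weight 2 lead to the same line.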