For data $x=\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ and $y=\begin{bmatrix} 4 \\ 5 \\ 3 \end{bmatrix}$, fit a linear model $y=ax+b+\varepsilon$ using the formula $\hat{\beta}=\left ( X^TX\right)^{-1}X^TY$.
This exercise should be really simple, but I have some doubts about my reasoning.
Let $Y_i = \beta_0 + \beta_1X_i+\varepsilon_i$ describe the $i$-th sample. The matrix of regressors is $X=\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3\end{bmatrix}$.
Therefore $\left ( X^TX\right)^{-1} = \begin{bmatrix} 14 & 6\\ 6 & 3 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{7}{3} & 1\\ 1 & \frac{1}{2}\end{bmatrix}$ and $X^TY=\begin{bmatrix}12\\ 23\end{bmatrix}$.
Multiplying these together gives:
$\hat{\beta}=\left ( X^TX\right)^{-1}X^TY = \begin{bmatrix} \frac{7}{3} & 1\\ 1 & \frac{1}{2}\end{bmatrix} \cdot \begin{bmatrix}12\\ 23\end{bmatrix} = \begin{bmatrix}40\\ 18.5\end{bmatrix}$
Thus $\beta_0 = b = 40$ and $\beta_1 = a = 18.5$, meaning that the final linear model would be:
$y = 18.5x + 40 + \varepsilon$.
However, this seems unlikely, because the graph of such a line would lie far above the given points. Am I making a mistake in my calculations? If so, where is my understanding incorrect?
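Mistakes: the matrix you wrote down is $X^T$, not $X$, so the product $X^TX$ (and therefore its inverse) comes out wrong.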
$$X^{\color{red}{T}}=\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3\end{bmatrix}.$$
$$X^TX=\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3\end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3\end{bmatrix}=\begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}.$$
To compute the inverse of a $2\times 2$ matrix, use:
$$\begin{bmatrix} a & b \\ c & d\end{bmatrix}^{-1}=\frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a\end{bmatrix}$$
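As a sanity check, here is a minimal Python sketch that carries the corrected computation through using exact fractions. The helper names (`transpose`, `matmul`, `inverse_2x2`) are my own, made up for illustration, and only the standard library is assumed. With the corrected $X^TX$ and the $2\times 2$ inverse formula it should give $\hat{\beta}_0 = 5$ and $\hat{\beta}_1 = -\tfrac{1}{2}$, i.e. the line $y = -\tfrac{1}{2}x + 5$, which passes through the cloud of points rather than far above it.

```python
from fractions import Fraction

# Data from the question
x = [1, 2, 3]
y = [4, 5, 3]

# Design matrix X: one row per sample, columns = (intercept, x_i)
X = [[1, xi] for xi in x]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


def inverse_2x2(m):
    """Inverse of [[a, b], [c, d]] via 1/(ad - bc) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


Xt = transpose(X)
XtX = matmul(Xt, X)                    # [[3, 6], [6, 14]]
XtY = matmul(Xt, [[yi] for yi in y])   # [[12], [23]]
beta = matmul(inverse_2x2(XtX), XtY)   # column vector (beta_0, beta_1)

beta0, beta1 = beta[0][0], beta[1][0]
print(beta0, beta1)  # prints: 5 -1/2
```

The fitted values at $x = 1, 2, 3$ are $4.5,\ 4,\ 3.5$, so the residuals $-0.5,\ 1,\ -0.5$ sum to zero, as expected for a least-squares fit with an intercept.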