When solving for the Ordinary Least Squares regression line, I was taught that you solve the system of equations $$ y = X\beta \tag{1} $$ where $X$ is the design matrix. The way I was shown to solve it is $$ X^Ty = X^TX\beta$$ $$ (X^TX)^{-1}X^Ty = \beta $$
But now that I am looking at it more closely, doesn't equation $(1)$ have no solution in almost all cases, since there is almost certainly no line that passes through all the points?
So I am wondering how solving with the transpose and inverse yields the $\beta$ that minimizes the error $\epsilon$ when the system itself seems to have no solution.
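To make the puzzle concrete, here is a minimal NumPy sketch. The five data points and the helper name `sse` are made up for illustration. It solves the normal equations, confirms that the residuals of $y = X\beta$ are not zero, and checks numerically that randomly nudging $\beta$ never gives a smaller sum of squared residuals.

```python
import numpy as np

# Hypothetical data: 5 points that do not lie exactly on one line.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.9, 5.2, 6.8, 9.1])

# Design matrix with an intercept column: shape (n, k) = (5, 2).
X = np.column_stack([np.ones_like(x), x])


def sse(b):
    """Sum of squared residuals of y = X b."""
    r = y - X @ b
    return float(r @ r)


# Solve the normal equations X^T X beta = X^T y.
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals of the original system (1); they are not all zero,
# so y = X beta has no exact solution for this data.
residuals = y - X @ beta

# Nudge beta in 100 random directions; none should reduce the error.
rng = np.random.default_rng(0)
perturbed = [beta + 0.1 * rng.standard_normal(2) for _ in range(100)]
no_better = all(sse(b) >= sse(beta) for b in perturbed)

print("beta:", beta)
print("residuals:", residuals)
print("no perturbation did better:", no_better)
```

This is only a numerical check on one data set, not a proof, but it shows the two facts side by side: system $(1)$ is not satisfied, yet the $\beta$ from the normal equations still has the smallest squared error among the candidates tried.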
I found the following question on Cross Validated that answers the question: https://stats.stackexchange.com/questions/309888/why-cant-we-cancel-these-two-matrices-in-the-ols-estimator
Here is the answer provided by @Aksakal: https://stats.stackexchange.com/a/309895/236997
I remember asking myself almost exactly the same question 300 years ago upon seeing the regression equation $y=X\beta$ for the first time in my life. The difference was that I asked myself: why don't we simply solve it as $\beta=X^{-1}y$? It turns out the answer is almost the same for both your question and mine.
How does "cancelling out" work? In simple algebra you start from $$a\times b=a\times c$$ Then you multiply both sides by the same number, in this case $a^{-1}$ (which exists as long as $a\neq 0$): $$a^{-1}a\times b=a^{-1}a\times c$$ $$ b = c$$
The trouble is that neither $(X^T)^{-1}$ nor $X^{-1}$ exists when $n\ne k$. As you can easily see, and actually allude to in your second question, these are rectangular matrices. Rectangular matrices have no inverse. I'll qualify the last statement later.
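A quick way to see this in practice, assuming NumPy and a made-up $5\times 2$ design matrix: `np.linalg.inv` refuses a non-square input and raises `LinAlgError`.

```python
import numpy as np

# Hypothetical design matrix with n = 5 rows and k = 2 columns.
X = np.column_stack([np.ones(5), np.arange(5.0)])

try:
    np.linalg.inv(X)
    inverted = True
except np.linalg.LinAlgError as err:
    # NumPy rejects the call because the matrix is not square.
    inverted = False
    print("cannot invert X:", err)

print("X.T @ X is square:", (X.T @ X).shape)
```

By contrast, $X^TX$ is $k\times k$, which is why the inverse in the OLS formula is taken of that product and not of $X$ or $X^T$ individually.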
When $n=k$, you could do what I suggested in the beginning, i.e. $\beta=X^{-1}y$, because the least squares problem is not necessary anymore. It degenerates into a simple linear algebra equation with a unique solution. As noted in the comments, though, not every square matrix has an inverse. For instance, a matrix with two identical rows has no inverse.
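Both halves of that paragraph can be checked with a small sketch, again assuming NumPy and made-up $2\times 2$ matrices: an invertible square $X$ gives an exact solution, while a matrix with two identical rows cannot be inverted.

```python
import numpy as np

# Hypothetical square case: n = k = 2, two points and two coefficients.
X = np.array([[1.0, 0.0],
              [1.0, 1.0]])
y = np.array([1.0, 3.0])

# X is invertible, so y = X beta has a unique exact solution.
beta = np.linalg.solve(X, y)  # same result as np.linalg.inv(X) @ y
exact = np.allclose(X @ beta, y)

# A square matrix with two identical rows is singular: no inverse exists.
S = np.array([[1.0, 2.0],
              [1.0, 2.0]])
try:
    np.linalg.inv(S)
    singular = False
except np.linalg.LinAlgError:
    singular = True

print("beta:", beta, "exact fit:", exact)
print("S is singular:", singular)
```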
Back to the inverse of a rectangular matrix. You may have heard about the matrix pseudoinverse operation. You can apply it to "invert" a rectangular matrix, but there's no shortcut here: doing so will indeed solve the least squares problem, so you'll get back to the starting point :)
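As a hedged illustration of that last point, the sketch below uses NumPy's `np.linalg.pinv` (the Moore-Penrose pseudoinverse) on made-up data and compares the result with the normal-equations solution. It assumes $X$ has full column rank, which is the case where $(X^TX)^{-1}$ exists.

```python
import numpy as np

# Hypothetical data: 5 points that do not lie exactly on one line.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.9, 5.2, 6.8, 9.1])
X = np.column_stack([np.ones_like(x), x])  # shape (5, 2), full column rank

# "Invert" the rectangular X with the pseudoinverse.
beta_pinv = np.linalg.pinv(X) @ y

# The usual OLS route through the normal equations.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

print("pseudoinverse:   ", beta_pinv)
print("normal equations:", beta_normal)
print("same answer:", np.allclose(beta_pinv, beta_normal))
```

For a full-column-rank $X$ the pseudoinverse equals $(X^TX)^{-1}X^T$, so the two routes agree up to floating-point error.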