What is the significance of having a matrix $A \in \mathbb{R}^{n\times p}$ such that $n = p$ and $\det(A) \neq 0$ in the context of solving for $\beta$ to minimize $$(y - A\beta)^T(y - A\beta)?$$
Can't we technically still find $\beta$ if $n \neq p$?
Do the columns of $A$ even need to be linearly independent to find $\beta$?
When the columns of $A$ are linearly dependent (regardless of $n=p$ or $n \ne p$), there will be infinitely many minimizers. In particular, if $\beta^*$ is a minimizer, then $\beta^* + z$ is also a minimizer for any vector $z$ in the nullspace of $A$ (since $A(\beta^* + z) = A\beta^*$).
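As a rough numerical check of this, here is a minimal NumPy sketch. The matrices `A` and `B`, the vector `y`, and the helper `loss` are made-up illustrations, not taken from the question. It compares a rank-deficient `A`, where shifting a minimizer along the nullspace leaves the objective unchanged, with a full-column-rank `B`, where the normal equations $B^T B \beta = B^T y$ have a single solution.

```python
import numpy as np

# Hypothetical data: the second column of A is twice the first,
# so the columns of A are linearly dependent (rank 1).
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
y = np.array([1.0, 2.0, 2.0])


def loss(M, beta):
    """Sum of squared residuals (y - M beta)^T (y - M beta)."""
    r = y - M @ beta
    return float(r @ r)


# lstsq returns one particular minimizer (the minimum-norm one) and the rank of A.
beta_star, _, rank_A, _ = np.linalg.lstsq(A, y, rcond=None)

# z lies in the nullspace of A, because 2 * col1 - 1 * col2 = 0.
z = np.array([2.0, -1.0])
beta_shifted = beta_star + 3.5 * z  # any multiple of z works

print("rank of A:", rank_A)
print("loss at beta*:      ", loss(A, beta_star))
print("loss at beta* + 3.5z:", loss(A, beta_shifted))

# Contrast: B has linearly independent columns, so B^T B is invertible
# and the normal equations have exactly one solution.
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
beta_normal = np.linalg.solve(B.T @ B, B.T @ y)
beta_lstsq, _, rank_B, _ = np.linalg.lstsq(B, y, rcond=None)
print("rank of B:", rank_B)
print("normal-equation solution:", beta_normal)
```

The two losses printed for `A` should agree up to floating-point error even though `beta_star` and `beta_shifted` are different vectors, while for `B` the normal-equation solution and the `lstsq` solution should coincide.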