It is known that any solution to the normal equations will solve the least squares minimization problem.
So finding a $w$ that satisfies $X^TXw=X^Ty$ shows that $w$ is optimal. However, $X^TX$ may not be invertible, which would suggest that the least squares solution need not be unique. On the other hand, we can use the QR decomposition, writing $X=QR$ where $Q$ has orthonormal columns and $R$ is square and upper triangular with positive entries on the diagonal. This decomposition always exists and is unique, and given $R$'s properties it is always invertible. Plugging this decomposition into the normal equations and simplifying (using $Q^TQ=I$ and cancelling $R^T$), we get $$Rw=Q^Ty.$$ Since $R$ is invertible, we have $w=R^{-1}Q^Ty$, which I believe is unique. Would this not imply that the solution to the least squares problem is always unique, regardless of the invertibility of $X^TX$?
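For concreteness, here is a quick numerical check of that derivation with NumPy (the matrix $X$ and vector $y$ are made up for illustration, and $X$ here happens to have full column rank):

```python
import numpy as np

# Made-up data; the columns of X are linearly independent.
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0]])
y = np.array([1.0, 2.0, 4.0])

# QR decomposition: Q has orthonormal columns, R is upper triangular.
Q, R = np.linalg.qr(X)

# Solve R w = Q^T y; R is invertible because X has full column rank.
w_qr = np.linalg.solve(R, Q.T @ y)

# Compare with the least squares solution computed directly.
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(w_qr, w_ls))
```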
The diagonal entries of $R$ can be chosen to be positive only if $X$ has full column rank. If the columns of $X$ are linearly dependent, at least one diagonal entry of $R$ will be zero. Thus $R$ might not be invertible.
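A quick NumPy check of this (with a made-up rank-deficient $X$ whose second column is twice the first):

```python
import numpy as np

# X with linearly dependent columns: column 2 = 2 * column 1.
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

Q, R = np.linalg.qr(X)

# The second diagonal entry of R is (numerically) zero, so R is
# singular and w = R^{-1} Q^T y does not exist.
print(np.diag(R))
```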