In his book "Introduction to Linear Algebra" (4th edition), Prof. Strang explains in chapter 4.2 (p. 210) that when $$Ax=b$$
has no solution, the best approximate solution $\hat x$ is found from this formula:
$$A^TA\hat x=A^Tb$$
This assumes the columns of $A$ span a subspace.
Later on, in chapter 4.4 (p. 233), while introducing the Gram-Schmidt process, he explains that a matrix $Q$ whose columns are orthonormal and span the same subspace as the columns of $A$ reduces the equation to:
$$\hat x=Q^Tb$$
because $Q^TQ=I$. However, a little later (p. 236), after explaining the $QR$ factorization, Prof. Strang explains that since $A=QR$, $\hat x$ can be found by: $$\hat x=R^{-1}Q^Tb$$
Is it the same $\hat x$? If so, why does multiplying by $R^{-1}$ produce the same result?
You're right, and for the least-squares solution of $Ax=b$ the equation should be
$$R\hat x=Q^Tb$$
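One way to see this is to substitute $A=QR$ into the normal equations. This is only a sketch, and it assumes the columns of $A$ are independent, so that $R$ (and hence $R^T$) is invertible:
$$A^TA\hat x=A^Tb \;\Longrightarrow\; R^TQ^TQR\,\hat x=R^TQ^Tb \;\Longrightarrow\; R^TR\,\hat x=R^TQ^Tb \;\Longrightarrow\; R\hat x=Q^Tb$$
The middle step uses $Q^TQ=I$, and the last step multiplies both sides by $(R^T)^{-1}$.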
Here's a quick example showing that $Q^Tb$ and $R^{-1}Q^Tb$ are not the same thing. Let
$A=\begin{bmatrix}1 & 2 \\ 1 & 3 \\ 1 & 1\end{bmatrix}$
Then use Gram-Schmidt and normalize to get
$Q=\begin{bmatrix}1/\sqrt{3} & 0 \\ 1/\sqrt{3} & 1/\sqrt{2} \\ 1/\sqrt{3} & -1/\sqrt{2}\end{bmatrix}$
You can find $R=Q^TA=\begin{bmatrix}\sqrt{3} & 2\sqrt{3} \\ 0 &\sqrt{2}\end{bmatrix}$
Consider $b=\begin{bmatrix}3\\4\\0\end{bmatrix}$. $Ax=b$ is not consistent: the first two equations force $x=\begin{bmatrix}1\\1\end{bmatrix}$, which fails the third.
Note $Q^Tb=\begin{bmatrix}7/\sqrt{3}\\4/\sqrt{2}\end{bmatrix}$
Now hold on to that and compute $A^TA=\begin{bmatrix}3 & 6 \\ 6 & 14\end{bmatrix}$
And $A^Tb=\begin{bmatrix}7 \\ 18 \end{bmatrix}$
$A^TA$ is invertible with $(A^TA)^{-1}=\begin{bmatrix}7/3 & -1 \\ -1 & 1/2\end{bmatrix}$
Multiply out $(A^TA)^{-1}A^Tb=\begin{bmatrix}-5/3 \\ 2 \end{bmatrix}\neq \begin{bmatrix}7/\sqrt{3}\\4/\sqrt{2}\end{bmatrix}=Q^Tb$
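Note, though, that if you compute $R^{-1}$ and multiply, you get $R^{-1}Q^Tb=\begin{bmatrix}-5/3 \\ 2 \end{bmatrix}$, the same $\hat x$ as from the normal equations.
Alternatively, the author could have said the following, which is the same equation since $Q^TA=R$: $$Q^TA\hat x=Q^Tb$$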
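If you want to check the example numerically, here is a minimal sketch using NumPy. The variable names (`x_normal`, `x_qr`, `qtb`) are my own, and `np.linalg.qr` may return $Q$ and $R$ with signs flipped relative to the hand computation above; that changes $Q^Tb$ but not the solution of $R\hat x=Q^Tb$.

```python
import numpy as np

# The matrix and right-hand side from the worked example.
A = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 1.0]])
b = np.array([3.0, 4.0, 0.0])

# Least-squares solution from the normal equations A^T A x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Reduced QR factorization: Q is 3x2 with orthonormal columns, R is 2x2 upper triangular.
# Column signs may differ from the Gram-Schmidt result computed by hand.
Q, R = np.linalg.qr(A)

# Q^T b on its own, and the solution of R x = Q^T b.
qtb = Q.T @ b
x_qr = np.linalg.solve(R, qtb)

print("normal equations:", x_normal)
print("R x = Q^T b     :", x_qr)
print("Q^T b alone     :", qtb)
```

The first two printed vectors should agree with each other and with the hand computation, while $Q^Tb$ alone should not.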