Define the closed span $H_n:=\overline{\text{sp}}\left\{X_1,\ldots,X_n\right\}$ and let $\hat{X}_{n+1}=P_{H_n}X_{n+1}$ denote the orthogonal projection of $X_{n+1}$ onto $H_n$.
I read that, defining $\hat{X}_1:=0$, we then have $$ H_n=\overline{\text{sp}}\left\{X_1-\hat{X}_1,X_2-\hat{X}_2,\ldots,X_n-\hat{X}_n\right\}. $$
Why does this hold?
I guess this has something to do with the Gram-Schmidt procedure.
Following the linked article:
We set $u_1:=X_1-\hat{X}_1=X_1$.
Then, we set $u_2=X_2-P_{\overline{\text{sp}}\left\{X_1\right\}}X_2$ which is by definition $X_2-\hat{X}_2$.
Up to here, I do understand.
Then, we set $$ u_3=X_3-P_{\overline{\text{sp}}\left\{X_1\right\}}X_3-P_{\overline{\text{sp}}\left\{u_2\right\}}X_3=X_3-P_{\overline{\text{sp}}\left\{X_1\right\}}X_3-P_{\overline{\text{sp}}\left\{X_2-\hat{X}_2\right\}}X_3. $$
So my question is whether $$ P_{\overline{\text{sp}}\left\{X_1\right\}}X_3+P_{\overline{\text{sp}}\left\{X_2-\hat{X}_2\right\}}X_3 = \hat{X}_3 $$ holds.
So let's start with the definition of the linear hull. The linear hull $sp(S)$ of a set $S$ (in your case the set $S=\{X_{1},...,X_{n}\}$) is the set of all finite linear combinations of elements of $S$.
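Now basically you have to show $sp(\{X_{1},...,X_{n}\})=sp(\{X_{1}-\hat{X}_{1},...,X_{n}-\hat{X}_{n}\})$. The span of finitely many vectors is finite-dimensional and therefore already closed, so taking the closure on both sides changes nothing.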
Without giving a complete proof, I'll try to argue that this holds; I hope you can fill in the blanks on your own. By the way, you are right: the right-hand side is exactly the span of the vectors obtained by applying Gram-Schmidt.
Now first convince yourself that the projection $\hat{X}_{n}:=P_{H_{n-1}}X_{n}$ is a linear combination of $X_{1},...,X_{n-1}$: it lies in $H_{n-1}$ by definition. You can also look it up in the Gram-Schmidt algorithm, which really only involves subtracting multiples of the $X_{i}$ for $i < n$ and some divisions by norms for normalization. Since $\hat{X}_{n}$ is a linear combination of $\{X_{1},...,X_{n-1}\}$, clearly $X_{n}-\hat{X}_{n}$ is a linear combination of $\{X_{1},...,X_{n}\}$.
Thus $\{X_{1}-\hat{X}_{1},...,X_{n}-\hat{X}_{n}\} \subset sp(\{X_{1},...,X_{n}\})$. Any linear combination of $\{X_{1}-\hat{X}_{1},...,X_{n}-\hat{X}_{n}\}$ is then again in the span, because a span is closed under linear combinations, which shows $sp(\{X_{1}-\hat{X}_{1},...,X_{n}-\hat{X}_{n}\}) \subset sp(\{X_{1},...,X_{n}\})$.
I think you should be able to prove the other inclusion analogously.
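For reference, here is one way the other inclusion can go, as a sketch by induction on $n$. I borrow the notation $u_k:=X_k-\hat{X}_k$ from the question. For $n=1$ there is nothing to show, since $u_1=X_1$. Assume $sp(\{X_{1},...,X_{n-1}\})=sp(\{u_{1},...,u_{n-1}\})$. The projection $\hat{X}_n$ lies in this span, so
$$ X_n = u_n+\hat{X}_n \in sp(\{u_{1},...,u_{n}\}), $$
and together with the induction hypothesis this gives $sp(\{X_{1},...,X_{n}\}) \subset sp(\{u_{1},...,u_{n}\})$.

The same picture settles the identity asked about for $u_3$. The vectors $X_1$ and $X_2-\hat{X}_2$ are orthogonal and span $H_2$, and the projection onto a subspace spanned by orthogonal vectors is the sum of the projections onto each of them, so
$$ \hat{X}_3=P_{H_2}X_3=P_{\overline{\text{sp}}\left\{X_1\right\}}X_3+P_{\overline{\text{sp}}\left\{X_2-\hat{X}_2\right\}}X_3. $$
If $X_2-\hat{X}_2=0$, the second projection is onto $\{0\}$ and simply vanishes.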
The intuitive idea is simply that the original set and the Gram-Schmidt set are related by invertible linear combinations: each vector of one set is a linear combination of the vectors of the other. Since the span consists of ALL linear combinations, it is left invariant when the vectors of $S$ that define it are replaced in this way.
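If you want to see this on a concrete example, here is a small numerical sanity check in Python. It is only a sketch, not a proof. The helper names `residuals` and `rank` and the three example vectors are made up for illustration. Note that `residuals` computes $\hat{X}_k$ through the Gram-Schmidt recursion, so the script also checks that each residual is orthogonal to all earlier $X_j$, which is the property that characterizes $X_k-\hat{X}_k$.

```python
from fractions import Fraction


def dot(a, b):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def residuals(vectors):
    """Return the residuals X_k - hat{X}_k, computed by the Gram-Schmidt recursion.

    hat{X}_k is built as the sum of the projections of X_k onto the
    earlier (mutually orthogonal) residuals; hat{X}_1 is 0.
    """
    res = []
    for x in vectors:
        x = [Fraction(c) for c in x]
        hat = [Fraction(0)] * len(x)
        for u in res:
            uu = dot(u, u)
            if uu != 0:  # a zero residual contributes nothing
                coef = dot(x, u) / uu
                hat = [h + coef * c for h, c in zip(hat, u)]
        res.append([a - h for a, h in zip(x, hat)])
    return res


def rank(rows):
    """Rank of a list of row vectors via exact Gaussian elimination."""
    m = [[Fraction(c) for c in r] for r in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Example vectors X_1, X_2, X_3 (chosen arbitrarily for illustration).
X = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 1, 1]]
U = residuals(X)

# Two spans coincide iff both have the same rank as their union.
same_span = rank(X) == rank(U) == rank(X + U)

# Each residual X_k - hat{X}_k should be orthogonal to X_1, ..., X_{k-1}.
orthogonal = all(dot(U[k], X[j]) == 0 for k in range(len(X)) for j in range(k))

print(same_span, orthogonal)
```

Exact fractions are used instead of floats so that the rank and orthogonality tests do not depend on a tolerance.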
I hope this helps!