I am trying to write out an alternative proof of the Morse Lemma that is sketched, but not fully written out, in Elementary Classical Analysis by Marsden and Hoffman.
This is my attempt. Let $x_0$ be a non-degenerate critical point of $f$. Without loss of generality, assume $x_0 = 0$ and $f(x_0) = f(0) = 0$; since $0$ is a critical point, $Df_0 = 0$. Use Taylor's theorem to write
\begin{align*}
f(x) &= f(0) + Df_0(x) + \frac12 D^2f_0 (x,x) + \frac12R_x (x,x) \\
&= \frac12 D^2f_0 (x,x) + \frac12R_x (x,x) \\
&= \frac12\langle A_xx,x\rangle,
\end{align*}
where, for each $x$, $A_x = D^2f_0 + R_x$ is a symmetric linear transformation of $\mathbb{R}^n$, and $R_0 = 0$, so $A_0 = D^2f_0$. Since $D^2f_0$ is non-singular, $A_0$ is an isomorphism, and so $A_x$ is an isomorphism if $x$ is near $0$ (here I am assuming that $A_x$ depends continuously on $x$).

Let $Q_x = A_0A_x^{-1}$, so $Q_0 = I$ and $Q_x^T = (A_x^{-1})^TA_0^T = A_x^{-1}A_0$, since $A_x$ is symmetric for all $x$. Define the square root $T_x$ of $Q_x$ for $x$ close to $0$ by the power series
$$ T_x = I - \sum_{n=1}^\infty \biggl| \binom{\frac12}{n} \biggr| (I-Q_x)^n, $$
which converges when $\|I - Q_x\| < 1$ and satisfies $T_x^2 = Q_x$ and $T_0 = I$. Its transpose is $T_x^T = I - \sum_{n=1}^\infty \bigl| \binom{1/2}{n} \bigr| (I-Q_x^T)^n$. Note that
$$ Q_xA_x = A_0A_x^{-1}A_x = A_0 = A_xA_x^{-1}A_0 = A_xQ_x^T, $$
so
$$ Q_x^nA_x = Q_x^{n-1}A_xQ_x^T = \cdots = A_x(Q_x^T)^n. $$
It follows that
\begin{align*}
T_xA_x &= A_x - \sum_{n=1}^\infty \biggl|\binom{\frac12}{n} \biggr| (I-Q_x)^nA_x \\
&= A_x - \sum_{n=1}^\infty \biggl|\binom{\frac12}{n} \biggr| (A_x - nQ_xA_x + \cdots + (-1)^nQ_x^nA_x) \\
&= A_x - \sum_{n=1}^\infty \biggl|\binom{\frac12}{n} \biggr| (A_x - nA_xQ_x^T + \cdots + (-1)^nA_x(Q_x^T)^n) \\
&= A_x - \sum_{n=1}^\infty \biggl|\binom{\frac12}{n} \biggr| A_x(I-Q_x^T)^n \\
&= A_xT_x^T.
\end{align*}

Since $T_0 = I$, $T_x$ is invertible for $x$ near $0$. Let $S_x = T_x^{-1}$. From $T_xA_x = A_xT_x^T$ we get $A_x = T_xA_x(T_x^T)^{-1} = T_xA_xS_x^T$. Multiplying on the left by $T_x$ and using $T_x^2 = Q_x$ gives $T_xA_x = Q_xA_xS_x^T = A_0S_x^T$, and hence $A_x = S_xA_0S_x^T$.

Let $h(x) = S_x^Tx$, so
\begin{align*}
f(x) &= \frac12\langle A_xx,x\rangle \\
&= \frac12\langle (S_x^T)^TA_0S_x^Tx,x\rangle \\
&= \frac12 \langle A_0S_x^Tx, S_x^Tx \rangle \\
&= \frac12 \langle A_0h(x), h(x) \rangle
\end{align*}
and $Dh_0 = (DS_x^T\cdot x + S_x^T)_0 = S_0^T= I$, which is invertible (here I am assuming that $x \mapsto S_x$ is continuously differentiable near $0$). Hence, by the inverse function theorem, $h$ is locally invertible. Let $g = h^{-1}$, so $h \circ g(x) = x$.
It follows that $f \circ g(x) = \frac12\langle A_0h \circ g(x), h \circ g(x)\rangle = \frac12\langle A_0x, x \rangle$. Now, since $\frac12D^2f_0 = \frac12A_0$ is an invertible symmetric $n \times n$ matrix, there is an invertible linear change of coordinates $x = Py$ such that $(P^T(\frac12D^2f_0)P)_{ij} = \pm \delta_{ij}$ (Sylvester's law of inertia). Replacing $g$ by $g \circ P$, we get $f \circ g(y) = \pm y_1^2 \pm \cdots \pm y_n^2$.
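As a sanity check on the construction, here is the one-dimensional case worked out by hand. The function $f(x) = 2x^2 + 2x^3$ is my own choice of example (it is not from the book), and in one dimension every linear map is just a number, so the transposes play no role:
\begin{align*}
f(x) &= \frac12 A_x x^2 \quad\text{with}\quad A_x = 4(1+x), \qquad A_0 = 4 = f''(0), \\
Q_x &= A_0A_x^{-1} = \frac{1}{1+x}, \qquad I - Q_x = \frac{x}{1+x}, \\
T_x &= \sqrt{Q_x} = (1+x)^{-1/2}, \qquad S_x = T_x^{-1} = (1+x)^{1/2}, \\
h(x) &= S_x x = x\sqrt{1+x}, \qquad h'(0) = 1, \\
\frac12 A_0\, h(x)^2 &= 2x^2(1+x) = f(x).
\end{align*}
The series for $T_x$ converges when $|x/(1+x)| < 1$, which holds for $x > -\frac12$, so everything is defined near $0$. Finally, $\frac12 A_0 = 2$, so the linear rescaling $y = \sqrt2\, h(x)$ gives $f = y^2$, which agrees with the general argument. This only checks one special case; it is not part of the proof.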
Is my proof mathematically correct and understandable? My math professor would always comment that the notation in my proofs is sloppy. Also, my linear algebra is terrible, and I'm concerned that I might have messed something up somewhere. I am just wondering whether I got everything correct this time. Thanks!