I usually encounter a version of Schwarz's theorem in which the continuity of all partial derivatives is required. I'm trying to prove Schwarz's theorem in a more general setting.
Could you please verify if my proof looks fine or contains logical gaps/errors? Thank you so much for your help!
Let $X$ be open in $\mathbb R^n$, $f:X \to F$, $a \in X$, and $i, j \in\{1,\ldots,n\}$. Suppose that $\partial_j \partial_i f$ exists in a neighborhood of $a$ and is continuous at $a$, and that $\partial_j f$ exists in a neighborhood of $a$. Then $\partial_i \partial_j f (a)$ exists and $$\partial_i \partial_j f (a) = \partial_j \partial_i f (a).$$
My attempt:
Let $\{e_1,\ldots, e_n\}$ be the standard basis of $\mathbb R^n$. Let $U$ be a neighborhood of $(0,0)$ in $\mathbb R^2$ small enough that all the points appearing below lie in $X$. Consider the map $$\Psi: U \to F, \quad (h, t) \mapsto f (a+ h e_i + t e_j) - f(a + h e_i) - f (a+t e_j) + f(a)$$ and, for fixed $t$, the map $$\Phi: s \mapsto f(a+s e_i + t e_j) - f (a + s e_i)$$ defined for $s$ near $0$.
We have $\Psi(h,t) = \Phi(h) - \Phi(0)$. By the Mean Value Theorem, we have $$\Phi(h) - \Phi(0) = \partial \Phi (\theta)h = (\partial_i f(a+ \theta e_i + t e_j) - \partial_i f (a +\theta e_i))h$$ for some $\theta$ between $0$ and $h$. With this $\theta$ fixed, consider the map $$\Gamma: s \mapsto \partial_i f(a+ \theta e_i + s e_j)$$ defined for $s$ near $0$.
By the Mean Value Theorem again, we have $$\partial_i f(a+ \theta e_i + t e_j) - \partial_i f (a +\theta e_i) = \Gamma(t) - \Gamma(0) = \partial \Gamma(\lambda)t = \partial_j \partial_i f(a+ \theta e_i + \lambda e_j)t$$ for some $\lambda$ between $0$ and $t$. As such, $\Psi(h,t) = \partial_j \partial_i f(a+ \theta e_i + \lambda e_j) ht$. For $ht \neq 0$, we have $$\partial_j \partial_i f(a+ \theta e_i + \lambda e_j) =\frac{\Psi(h,t)}{ht} = \frac{ f (a+ h e_i + t e_j) - f(a + h e_i) - f (a+t e_j) + f(a) }{ht}$$ Hence $$\begin{aligned} & \partial_j \partial_i f(a+ \theta e_i + \lambda e_j) -\partial_j \partial_i f(a) \\ = \quad& \frac{ f (a+ h e_i + t e_j) - f(a + h e_i) - f (a+t e_j) + f(a) }{ht} - \partial_j \partial_i f(a) \end{aligned}$$
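As a sanity check on the identity $\Psi(h,t) = \partial_j \partial_i f(a+ \theta e_i + \lambda e_j)\, ht$, here is a small worked instance of my own (an illustration only, not part of the proof). I assume $n=2$, $F=\mathbb R$, $i=1$, $j=2$, $a=(a_1,a_2)$, and the test function $f(x,y)=x^2y$: $$\begin{aligned} \Psi(h,t) &= (a_1+h)^2(a_2+t) - (a_1+h)^2 a_2 - a_1^2(a_2+t) + a_1^2 a_2 \\ &= \left((a_1+h)^2 - a_1^2\right)t = (2a_1+h)\,ht, \\ \partial_2 \partial_1 f(x,y) &= 2x. \end{aligned}$$
So $\Psi(h,t)/(ht) = 2a_1 + h = \partial_2 \partial_1 f(a+\theta e_1 + \lambda e_2)$ with $\theta = h/2$, which does lie between $0$ and $h$. In this example any $\lambda$ between $0$ and $t$ works, because $\partial_2 \partial_1 f$ does not depend on $y$.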
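Since $\theta$ lies between $0$ and $h$ and $\lambda$ lies between $0$ and $t$, the point $a+ \theta e_i + \lambda e_j$ tends to $a$ as $(h,t) \to (0,0)$. It follows from the continuity of $\partial_j \partial_i f$ at $a$ that $$\forall \delta >0,\ \exists\epsilon >0,\ \forall h,t \text{ with } |h|+|t|<\epsilon: \quad \|\partial_j \partial_i f(a+ \theta e_i + \lambda e_j) -\partial_j \partial_i f(a)\| < \delta$$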
As such, for all $h,t \in \mathbb R \setminus\{0\}$ such that $|h|+|t|<\epsilon$, we have $$\left \| \frac{ f (a+ h e_i + t e_j) - f(a + h e_i) - f (a+t e_j) + f(a) }{ht} - \partial_j \partial_i f(a) \right \| <\ \delta$$
Fix $h \neq 0$ with $|h| < \epsilon$. Since $\partial_j f$ exists in a neighborhood of $a$, the quotient below has a limit as $t \to 0$ once $|h|$ is small enough. Taking the limit $t \to 0$, we obtain $$ \lim_{t \to 0}\left \| \frac{ f (a+ h e_i + t e_j) - f(a + h e_i) - f (a+t e_j) + f(a) }{ht} - \partial_j \partial_i f(a) \right \| \le\ \delta$$ and, by continuity of the norm, $$ \left \|\lim_{t \to 0} \frac{ f (a+ h e_i + t e_j) - f(a + h e_i) - f (a+t e_j) + f(a) }{ht} - \partial_j \partial_i f(a) \right \| \le\ \delta$$ that is, $$ \left \|\frac{1}{h}\lim_{t \to 0} \left( \frac{ f (a+ h e_i + t e_j) - f(a + h e_i) }{t} - \frac{ f (a+t e_j)- f(a)}{t} \right) - \partial_j \partial_i f(a) \right \| \le\ \delta$$ and consequently $$ \left \|\frac{1}{h} \lim_{t \to 0} \frac{ f (a+ h e_i + t e_j) - f(a + h e_i) }{t} - \frac{1}{h} \lim_{t \to 0}\frac{ f (a+t e_j)- f(a)}{t} - \partial_j \partial_i f(a) \right \| \le\ \delta$$
It follows from the existence of $\partial_j f$ in a neighborhood of $a$ that $$\lim_{t \to 0} \frac{ f (a+ h e_i + t e_j) - f(a + h e_i) }{t} = \partial_j f (a+he_i)$$ and $$ \lim_{t \to 0} \frac{ f (a+t e_j)- f(a)}{t} = \partial_j f (a)$$
As such, $$\left \|\frac{\partial_j f (a+he_i) - \partial_j f (a)}{h} - \partial_j \partial_i f(a) \right\| \le \delta$$
For all $\delta >0$, there is $\epsilon >0$ such that the last inequality holds for all $h$ with $0 < |h| < \epsilon$. It follows that $$\lim_{h \to 0} \frac{\partial_j f (a+he_i) - \partial_j f (a)}{h} = \partial_j \partial_i f(a)$$
Hence $\partial_i \partial_j f(a)$ exists and $$\partial_i \partial_j f(a) = \partial_j \partial_i f(a)$$
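As a concrete check of this final limit (my own illustration, not part of the proof), assume $n=2$, $F=\mathbb R$, $i=1$, $j=2$, $a=(a_1,a_2)$, and the test function $f(x,y)=x^2y$. Then $\partial_2 f(x,y) = x^2$ and $\partial_2 \partial_1 f(x,y) = 2x$, so $$\frac{\partial_2 f (a+he_1) - \partial_2 f (a)}{h} = \frac{(a_1+h)^2 - a_1^2}{h} = 2a_1 + h \xrightarrow[h \to 0]{} 2a_1 = \partial_2 \partial_1 f(a)$$
Here the difference between the quotient and $\partial_2 \partial_1 f(a)$ is exactly $h$, so for this particular $f$ the choice $\epsilon = \delta$ works in the $\delta$, $\epsilon$ statement.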