Generalize Clairaut-Schwarz theorem to arbitrary order of mixed partial derivatives


After reading the answer here to understand how to apply the difference operator, I've figured out how to generalize my proof of the Clairaut-Schwarz theorem here to mixed partial derivatives of arbitrary order.

Could you please verify if my proof looks fine or contains logical gaps/errors? Thank you so much for your help!

$\textbf{Generalized Clairaut-Schwarz Theorem:}$ Let $X$ be open in $\mathbb R^n$, $a \in X$, $f:X \to F$, and $m \in \mathbb N$. Suppose $j_1, j_2, \ldots, j_m \in\{1,\ldots,n\}$ and $\sigma$ is a permutation of $\{1, \ldots, m\}$. If $\partial_{j_1} \partial_{j_2} \cdots \partial_{j_m} f$ is continuous at $a$ and $\partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f$ exists in a neighborhood of $a$, then $$\partial_{j_1} \cdots \partial_{j_m} f (a)= \partial_{j_{\sigma(1)}} \cdots \partial_{j_{\sigma(m)}} f(a)$$

In my proof, I use the two lemmas below, which rely on the following difference operator:

Let $\{e_1,\ldots, e_n\}$ be the standard basis of $\mathbb R^n$. For $h \in \mathbb R$ and $j \in \{1,\ldots,n\}$, we define a map $\Delta_j^h f$ by $$\Delta_j^h f: X \to F, \quad x \mapsto f(x+he_j)-f(x)$$
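To make the definition concrete, here is a minimal Python sketch of $\Delta_j^h$ as a higher-order function. The helper name `delta`, the 0-based coordinate index, and the test function $f(x,y)=x^2y^3$ are illustrative assumptions, not part of the original post; exact rational arithmetic is used so no rounding is involved.

```python
from fractions import Fraction

def delta(f, j, h):
    """Return the map x -> f(x + h*e_j) - f(x).

    x is a tuple of coordinates; j is a 0-based index (the text uses 1-based).
    """
    def shifted_difference(x):
        x_shifted = tuple(xi + h if i == j else xi for i, xi in enumerate(x))
        return f(x_shifted) - f(x)
    return shifted_difference

# Hypothetical test function f(x, y) = x^2 * y^3 (an assumption for illustration).
def f(x):
    return x[0] ** 2 * x[1] ** 3

a = (Fraction(1), Fraction(2))
h = Fraction(1, 2)

# f(a + h e_1) - f(a) = f(3/2, 2) - f(1, 2) = 18 - 8
first_difference = delta(f, 0, h)(a)
print(first_difference)  # prints 10
```

Because `delta` returns a function, the operators can be composed, which is how the iterated differences in the lemmas are formed.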

$\textbf{Lemma 1:}$ $$\partial_{j_1} \cdots \partial_{j_m} f (a) = \lim_{h_1 \to 0} \left ( \lim_{h_2 \to 0} \left( \cdots \left ( \lim_{h_m \to 0} \left( \frac{ \Delta_{j_1}^{h_1} \cdots\Delta_{j_m}^{h_m} f (a)}{h_1 \cdots h_m} \right ) \right ) \cdots \right ) \right)$$
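As a sanity check of Lemma 1 for $m=2$, the sketch below evaluates the quotient $\Delta_{1}^{h_1}\Delta_{2}^{h_2} f(a)/(h_1 h_2)$ for shrinking steps and compares it with a hand-computed mixed partial. The helper `delta`, the polynomial $f(x,y)=x^2y^3$, and the point $a=(1,2)$ are assumptions chosen for illustration. For this polynomial the quotient is continuous in $(h_1,h_2)$, so shrinking both steps together is enough to see the iterated limit; this is only a numerical illustration, not a proof.

```python
from fractions import Fraction

def delta(f, j, h):
    """Return the map x -> f(x + h*e_j) - f(x), with j a 0-based index."""
    def shifted_difference(x):
        x_shifted = tuple(xi + h if i == j else xi for i, xi in enumerate(x))
        return f(x_shifted) - f(x)
    return shifted_difference

# Hypothetical test function f(x, y) = x^2 * y^3.
def f(x):
    return x[0] ** 2 * x[1] ** 3

# Its mixed partial d_1 d_2 f = 6 x y^2, computed by hand.
def mixed_partial(x):
    return 6 * x[0] * x[1] ** 2

a = (Fraction(1), Fraction(2))

def quotient(h1, h2):
    # Delta_1^{h1} applied to (Delta_2^{h2} f), evaluated at a, divided by h1*h2.
    return delta(delta(f, 1, h2), 0, h1)(a) / (h1 * h2)

# Errors for step sizes 1/10, 1/100, 1/1000 in both directions.
errors = [
    abs(quotient(Fraction(1, 10 ** k), Fraction(1, 10 ** k)) - mixed_partial(a))
    for k in (1, 2, 3)
]
for e in errors:
    print(float(e))
```

The printed errors shrink as the steps shrink, consistent with the quotient tending to $\partial_1\partial_2 f(a) = 24$.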

$\textbf{Lemma 2:}$ The finite difference operator is commutative, i.e. $$ \Delta_{j_1}^{h_1} \cdots\Delta_{j_m}^{h_m} f (a) = \Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)$$
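Lemma 2 can be checked exactly on an example. The sketch below applies three difference operators in every possible order and confirms that the result does not change. The helper names `delta` and `iterated_difference`, the test function, and the chosen indices and steps are all illustrative assumptions; rational arithmetic keeps the comparison exact.

```python
from fractions import Fraction
from itertools import permutations

def delta(f, j, h):
    """Return the map x -> f(x + h*e_j) - f(x), with j a 0-based index."""
    def shifted_difference(x):
        x_shifted = tuple(xi + h if i == j else xi for i, xi in enumerate(x))
        return f(x_shifted) - f(x)
    return shifted_difference

def iterated_difference(f, indices, steps, a):
    """Evaluate Delta_{j_1}^{h_1} ... Delta_{j_m}^{h_m} f at a."""
    g = f
    # The rightmost operator acts first, so walk both lists from the end.
    for j, h in zip(reversed(indices), reversed(steps)):
        g = delta(g, j, h)
    return g(a)

# Hypothetical test function on R^2.
def f(x):
    return x[0] ** 2 * x[1] ** 3 + x[0] * x[1]

a = (Fraction(1), Fraction(2))
indices = [0, 1, 0]  # j_1, j_2, j_3 (0-based; a direction may repeat)
steps = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]  # h_1, h_2, h_3

reference = iterated_difference(f, indices, steps, a)
all_equal = all(
    iterated_difference(
        f, [indices[s] for s in sigma], [steps[s] for s in sigma], a
    ) == reference
    for sigma in permutations(range(3))
)
print(all_equal)  # prints True
```

Note that each step $h_i$ travels with its direction $j_i$ under the permutation, exactly as in the statement of the lemma.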


$\textbf{My attempt:}$

By the Mean Value Theorem, we have $$\begin{aligned} & \quad \frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}}\\ =& \quad \frac{\Delta_{j_1}^{h_1} \cdots\Delta_{j_{m}}^{h_{m}} f (a)}{h_1 \cdots h_m} \quad \text{by} \,\, \textbf{Lemma 2} \\ =& \quad \frac{\partial_{j_{1}} \cdots \partial_{j_{m}} f (a + t_1 e_{j_1} + \cdots + t_{m} e_{j_{m}}) h_1 \cdots h_{m}}{h_1 \cdots h_m} \quad \text{by} \,\, \textbf{MVT} \\ =& \quad\partial_{j_{1}} \cdots \partial_{j_{m}} f (a + t_1 e_{j_1} + \cdots + t_{m} e_{j_{m}}) \end{aligned}$$ in which

$$\begin{aligned} \min\{0,h_1\} < t_1 < \max\{0,h_1\} \\ \vdots\quad\quad\quad\quad\quad\quad \,\,\, \\ \min\{0,h_m\} < t_m < \max\{0,h_m\} \end{aligned}$$

Hence

$$\begin{aligned} & \quad \left \|\frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \|\\ =& \quad \left \| \partial_{j_{1}} \cdots \partial_{j_{m}} f (a + t_1 e_{j_1} + \cdots + t_{m} e_{j_{m}}) - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \|\end{aligned}$$

Let $t = |t_1| + \cdots+|t_{m}|$ and $h= |h_1| + \cdots+|h_{m}|$. It follows from the continuity of $\partial_{j_1} \partial_{j_2} \cdots \partial_{j_m} f$ at $a$ that for all $\delta > 0$ there is $\epsilon > 0$ such that $$\left \| \partial_{j_{1}} \cdots \partial_{j_{m}} f (a + t_1 e_{j_1} + \cdots + t_{m} e_{j_{m}}) - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| < \delta$$ for all $t < \epsilon$. Since $|t_i| < |h_i|$ for each $i$, we have $t < h$. As such, for all $h < \epsilon$, we have $$ \left \|\frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| < \delta$$

Taking the limit $h_{\sigma(m)} \to 0$, we have $$\lim_{h_{\sigma(m)} \to 0} \left \|\frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta$$

and consequently $$ \left \| \lim_{h_{\sigma(m)} \to 0} \left (\frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} \right ) - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta$$

and consequently $$\left \| \frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m-1)}}^{h_{\sigma(m-1)}} \partial_{j_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m-1)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta \quad \text{by} \,\, \textbf{Lemma 1}$$

Iterating this process of taking limits, we get $$\left \| \lim_{h_{\sigma(1)} \to 0} \frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \left ( \partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f \right) (a)}{h_{\sigma(1)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta$$ or equivalently $$\left \| \lim_{h_{\sigma(1)} \to 0} \frac{ \left ( \partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f \right) (a + h_{\sigma(1)} e_{j_{\sigma(1)}}) - \left (\partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f \right)(a)}{h_{\sigma(1)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta$$

For all $\delta >0$, there is $\epsilon >0$ such that for all $|h_{\sigma(1)}| < \epsilon$, the last inequality holds. It follows that $$\partial_{j_{\sigma(1)}}\left (\partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f \right)(a) = \partial_{j_{1}} \cdots \partial_{j_{m}} f (a)$$ and consequently $$\partial_{j_{\sigma(1)}} \partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f (a) = \partial_{j_{1}} \cdots \partial_{j_{m}} f (a)$$

This completes the proof.

Best answer

Thanks to @Pietro for pointing out my fatal misunderstanding of the MVT for vector-valued functions. I've figured out a fix by using the integral form of the MVT.
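For readers wondering why the equality form of the MVT cannot be used when $F$ is not $\mathbb R$, here is a standard counterexample, sketched numerically. It is not taken from the original discussion and may not be the exact point @Pietro raised; the curve $\gamma(t) = (\cos t, \sin t)$ and the names `gamma` and `gamma_prime` are assumptions for illustration. The increment $\gamma(2\pi) - \gamma(0)$ vanishes, yet $\gamma'(t)$ has norm $1$ everywhere, so no $\xi$ satisfies $\gamma(2\pi) - \gamma(0) = 2\pi\,\gamma'(\xi)$.

```python
import math

def gamma(t):
    """A closed curve in R^2: one full turn around the unit circle."""
    return (math.cos(t), math.sin(t))

def gamma_prime(t):
    """Derivative of gamma, computed by hand."""
    return (-math.sin(t), math.cos(t))

# Norm of the increment over [0, 2*pi]: zero up to rounding.
increment = tuple(p - q for p, q in zip(gamma(2 * math.pi), gamma(0.0)))
increment_norm = math.hypot(*increment)

# Smallest speed |gamma'(t)| on a grid of 1001 sample points in [0, 2*pi].
min_speed = min(
    math.hypot(*gamma_prime(2 * math.pi * k / 1000)) for k in range(1001)
)

print(increment_norm, min_speed)
```

The integral form $\gamma(b) - \gamma(a) = \int_0^1 \gamma'(a + t(b-a))(b-a)\,\mathrm{d}t$ has no such problem, which is why it is used in the updated proof.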


$\textbf{My updated proof:}$

By $\textbf{Lemma 2}$, we have $$ \frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} = \frac{\Delta_{j_1}^{h_1} \cdots\Delta_{j_{m}}^{h_{m}} f (a)}{h_1 \cdots h_m}$$

By the integral form of the Mean Value Theorem for vector-valued functions, we have $$\begin{aligned} & \quad \Delta_{j_1}^{h_1} \cdots \Delta_{j_m}^{h_m} f (a) \\ =& \quad \Delta_{j_1}^{h_1} \cdots \Delta_{j_{m-1}}^{h_{m-1}} \Delta_{j_{m}}^{h_{m}} f (a) \\ =& \quad \Delta_{j_{1}}^{h_{1}} \cdots \Delta_{j_{m-1}}^{h_{m-1}} f(a+ h_{m} e_{j_{m}}) - \Delta_{j_{1}}^{h_{1}} \cdots \Delta_{j_{m-1}}^{h_{m-1}} f(a) \\ = & \quad \int_0^1 \partial_{j_{m}} \Delta_{j_{1}}^{h_{1}} \cdots \Delta_{j_{m-1}}^{h_{m-1}} f(a+ t_m h_m e_{j_{m}})h_{m} \, \mathrm{d} t_m \\ \end{aligned}$$

Similarly, $$\begin{aligned} \quad & \partial_{j_{m}} \Delta_{j_{1}}^{h_{1}} \cdots \Delta_{j_{m-1}}^{h_{m-1}} f(a+ t_m h_m e_{j_{m}}) \\ =\quad & \partial_{j_{m}} \Delta_{j_{1}}^{h_{1}} \cdots \Delta_{j_{m-2}}^{h_{m-2}} f \left (a+ t_m h_m e_{j_{m}} + h_{m-1} e_{j_{m-1}} \right)\\& \quad \quad \quad \quad - \partial_{j_{m}} \Delta_{j_{1}}^{h_{1}} \cdots \Delta_{j_{m-2}}^{h_{m-2}} f \left (a+ t_m h_m e_{j_{m}} \right) \\ = \quad & \int_0^1 \partial_{j_{m}} \partial_{j_{m-1}} \Delta_{j_{1}}^{h_{1}} \cdots \Delta_{j_{m-2}}^{h_{m-2}} f \left (a+ t_m h_m e_{j_{m}} + t_{m-1} h_{m-1} e_{j_{m-1}}\right ) h_{m-1} \, \mathrm{d} t_{m-1} \\ \end{aligned}$$

Iterating the use of the integral form of the Mean Value Theorem for vector-valued functions, we get $$\Delta_{j_1}^{h_1} \cdots \Delta_{j_m}^{h_m} f (a) = {\int_0^1 \cdots \int_0^1} \partial_{j_{1}} \cdots \partial_{j_{m}} f \left (a+ t_1 h_1 e_{j_{1}} + \cdots+ t_{m} h_m e_{j_{m}}\right ) h_{1} \cdots h_{m} \, \mathrm{d} t_{1} \cdots \, \mathrm{d} t_{m}$$ and consequently $$ \frac{\Delta_{j_1}^{h_1} \cdots\Delta_{j_{m}}^{h_{m}} f (a)}{h_1 \cdots h_m} = {\int_0^1 \cdots \int_0^1} \partial_{j_{1}} \cdots \partial_{j_{m}} f \left (a+ t_1 h_1 e_{j_{1}} + \cdots+ t_{m} h_m e_{j_{m}}\right )\, \mathrm{d} t_{1} \cdots \, \mathrm{d} t_{m}$$ and consequently $$\begin{aligned} &\left \|\frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \|\\ = \quad & \left \| {\int_0^1 \cdots \int_0^1} \Big ( \partial_{j_{1}} \cdots \partial_{j_{m}} f \left (a+ t_1 h_1 e_{j_{1}} + \cdots+ t_{m} h_m e_{j_{m}}\right ) - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \Big) \mathrm{d} t_{1} \cdots \, \mathrm{d} t_{m} \right \| \\ \le \quad & {\int_0^1 \cdots \int_0^1} \Big \| \partial_{j_{1}} \cdots \partial_{j_{m}} f \left (a+ t_1 h_1 e_{j_{1}} + \cdots+ t_{m} h_m e_{j_{m}}\right ) - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \Big \| \mathrm{d} t_{1} \cdots \, \mathrm{d} t_{m} \\ \end{aligned}$$
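The integral representation of the difference quotient can be sanity-checked numerically for $m=2$. The sketch below compares $\Delta_1^{h_1}\Delta_2^{h_2} f(a)/(h_1h_2)$ with a midpoint-rule approximation of $\int_0^1\!\int_0^1 \partial_1\partial_2 f(a + t_1h_1e_1 + t_2h_2e_2)\,\mathrm{d}t_1\,\mathrm{d}t_2$. The helper `delta`, the test function $f(x,y)=x^2y^3$, the point, the steps, and the grid size are all assumptions for illustration; agreement up to quadrature error is only a plausibility check.

```python
def delta(f, j, h):
    """Return the map x -> f(x + h*e_j) - f(x), with j a 0-based index."""
    def shifted_difference(x):
        x_shifted = tuple(xi + h if i == j else xi for i, xi in enumerate(x))
        return f(x_shifted) - f(x)
    return shifted_difference

# Hypothetical test function f(x, y) = x^2 * y^3.
def f(x):
    return x[0] ** 2 * x[1] ** 3

# Its mixed partial d_1 d_2 f = 6 x y^2, computed by hand.
def mixed_partial(x):
    return 6 * x[0] * x[1] ** 2

a = (1.0, 2.0)
h1, h2 = 0.1, 0.2

# Left-hand side: the second-order difference quotient at a.
quotient = delta(delta(f, 1, h2), 0, h1)(a) / (h1 * h2)

# Right-hand side: midpoint rule on an n-by-n grid over [0, 1]^2.
n = 200
integral = 0.0
for p in range(n):
    t1 = (p + 0.5) / n
    for q in range(n):
        t2 = (q + 0.5) / n
        integral += mixed_partial((a[0] + t1 * h1, a[1] + t2 * h2))
integral /= n * n

print(quotient, integral)
```

Both printed values should agree to several decimal places, with the small gap coming from the quadrature rule and floating-point rounding.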

Let $h= |h_1| + \cdots+|h_{m}|$. It follows from the continuity of $\partial_{j_1} \partial_{j_2} \cdots \partial_{j_m} f$ at $a$ that for all $\delta > 0$ there is $\epsilon > 0$ such that $$\Big \| \partial_{j_{1}} \cdots \partial_{j_{m}} f (a + t_1 h_1 e_{j_1} + \cdots + t_{m} h_m e_{j_{m}}) - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \Big \| < \delta$$ for all $h < \epsilon$ and all $(t_1,\ldots,t_m) \in [0,1]^m$, since the point $a + t_1 h_1 e_{j_1} + \cdots + t_{m} h_m e_{j_{m}}$ then lies within distance $h$ of $a$. As such, for all $h < \epsilon$, we have $$ {\int_0^1 \cdots \int_0^1} \Big \| \partial_{j_{1}} \cdots \partial_{j_{m}} f \left (a+ t_1 h_1 e_{j_{1}} + \cdots+ t_{m} h_m e_{j_{m}}\right ) - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \Big \| \mathrm{d} t_{1} \cdots \, \mathrm{d} t_{m} < \delta$$ and consequently $$\left \|\frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| < \delta$$

Taking the limit $h_{\sigma(m)} \to 0$, we have $$\lim_{h_{\sigma(m)} \to 0} \left \|\frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta$$

and consequently $$ \left \| \lim_{h_{\sigma(m)} \to 0} \left (\frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m)}}^{h_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m)}} \right ) - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta$$

and consequently $$\left \| \frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \cdots\Delta_{j_{\sigma(m-1)}}^{h_{\sigma(m-1)}} \partial_{j_{\sigma(m)}} f (a)}{h_{\sigma(1)} \cdots h_{\sigma(m-1)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta \quad \text{by} \,\, \textbf{Lemma 1}$$

Iterating this process of taking limits, we get $$\left \| \lim_{h_{\sigma(1)} \to 0} \frac{\Delta_{j_{\sigma(1)}}^{h_{\sigma(1)}} \left ( \partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f \right) (a)}{h_{\sigma(1)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta$$ or equivalently $$\left \| \lim_{h_{\sigma(1)} \to 0} \frac{ \left ( \partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f \right) (a + h_{\sigma(1)} e_{j_{\sigma(1)}}) - \left (\partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f \right)(a)}{h_{\sigma(1)}} - \partial_{j_{1}} \cdots \partial_{j_{m}} f (a) \right \| \le \delta$$

For all $\delta >0$, there is $\epsilon >0$ such that for all $|h_{\sigma(1)}| < \epsilon$, the last inequality holds. It follows that $$\partial_{j_{\sigma(1)}}\left (\partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f \right)(a) = \partial_{j_{1}} \cdots \partial_{j_{m}} f (a)$$ and consequently $$\partial_{j_{\sigma(1)}} \partial_{j_{\sigma(2)}} \cdots \partial_{j_{\sigma(m)}} f (a) = \partial_{j_{1}} \cdots \partial_{j_{m}} f (a)$$

This completes the proof.