Reformulating Real Analysis using any basis of $\mathbb{R}^p$


To illustrate the question that I'm asking, consider the theorem:

If the partial derivatives of $f$ exist in a neighbourhood of $c$ and are continuous at $c$, then $f$ is differentiable at $c$.

However, when we talk about partial derivatives, we mean the directional derivatives in the directions of the standard basis vectors $(0,\ldots,1,\ldots,0)$ of $\mathbb{R}^p$.

First of all, why do we choose this basis and formulate the theorems with respect to it? Would any orthonormal basis of $\mathbb{R}^p$ work in the same way?

For example, if we think geometrically, in $\mathbb{R}^3$ the choice of the $x,y,z$ coordinates is arbitrary, so any orthonormal basis of $\mathbb{R}^3$ should work; but what about an arbitrary basis of $\mathbb{R}^3$?

Secondly, if we were to formulate the above theorem with respect to an arbitrary basis of $\mathbb{R}^p$, how could we do that?

Edit:

Please provide an argument to your statements.

Edit 2:

In the book that I'm using (The Elements of Real Analysis by Bartle), the above theorem is given before it is stated that the derivative of $f$ can be written in terms of the gradient, so an answer to this question cannot use that fact.

There are 4 best solutions below

On BEST ANSWER

The basis you choose to work with does matter. Indeed, consider a basis $v_1,\ldots, v_n$ of $\mathbb R^n$. Rewriting the Fréchet derivative, for any $h=\sum_{i=1}^n h_i v_i$,

$$df(x)(h)=df(x)\Big(\sum_{i=1}^n h_iv_i\Big)=\sum_{i=1}^n h_i\,df(x)(v_i)$$

If you define $\frac{\partial f}{\partial x_i}(x)$ as $df(x)(v_i)$, then you get the formula $df(x)(h) = \sum_{i=1}^n h_i \frac{\partial f}{\partial x_i}(x)$.

However, you may not write this last quantity as $\nabla f(x)^Th$ because the $h_i$ are not necessarily the coordinates of $h$ in the canonical basis.
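As a sanity check, this linearity can be verified numerically. The function $f$ and the (non-orthogonal) basis below are made up purely for illustration; the point is only that $df(x)(h)$ agrees with $\sum_i h_i\,df(x)(v_i)$ when the $h_i$ are the coordinates of $h$ in that basis.

```python
import numpy as np

# Toy example: f(x, y) = x^2 * y, whose Frechet derivative at x acts on h as
# df(x)(h) = 2*x[0]*x[1]*h[0] + x[0]**2 * h[1].
def f(p):
    return p[0] ** 2 * p[1]

def df(x, h):
    return 2 * x[0] * x[1] * h[0] + x[0] ** 2 * h[1]

# A non-orthogonal basis v_1, v_2 of R^2, chosen arbitrarily (columns of V).
V = np.array([[1.0, 1.0],
              [0.0, 2.0]])
x = np.array([1.5, -0.5])

# Coordinates h_i of a vector h in this basis: h = h_1 v_1 + h_2 v_2.
coords = np.array([0.3, 0.7])
h = V @ coords

# Linearity: df(x)(h) equals sum_i h_i * df(x)(v_i).
lhs = df(x, h)
rhs = sum(coords[i] * df(x, V[:, i]) for i in range(2))
print(np.isclose(lhs, rhs))  # True
```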

On

Any basis would suffice for this theorem. We just use the standard basis because it's the easiest to work with.

The best way to look at this: given that the theorem holds for a basis $\{v_i\}$ of $\mathbb{R}^p$, and $e_j = \sum_i a_{i,j}v_i$, then $\frac{\partial f}{\partial x_j} = \sum_i a_{i,j}\frac{\partial f}{\partial v_i}$.
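This change-of-basis relation for partial derivatives can be checked numerically. The function and basis below are invented for illustration, and the directional derivatives are approximated by central differences:

```python
import numpy as np

# Check: if e_j = sum_i a_{ij} v_i, then by linearity of the derivative,
#   df/dx_j = sum_i a_{ij} * (df/dv_i),
# where df/dv_i denotes the directional derivative along v_i.

def f(p):
    return np.sin(p[0]) * p[1] + p[0] * p[1] ** 2

def directional(f, x, v, eps=1e-6):
    """Central-difference directional derivative of f at x along v."""
    return (f(x + eps * v) - f(x - eps * v)) / (2 * eps)

V = np.array([[2.0, 1.0],
              [1.0, 1.0]])      # columns v_1, v_2: any basis works
A = np.linalg.inv(V)            # e_j = sum_i A[i, j] v_i, since V A = I
x = np.array([0.4, 1.3])

for j, e in enumerate(np.eye(2)):
    lhs = directional(f, x, e)  # partial derivative along e_j
    rhs = sum(A[i, j] * directional(f, x, V[:, i]) for i in range(2))
    print(np.isclose(lhs, rhs, atol=1e-6))  # True for each j
```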

On

Let $\{v_1,\dots,v_n\}$ be an orthonormal basis of $\mathbb{R}^n$ and let $U$ be the matrix whose $i$-th column is $v_i$. Notice that $U$ is orthogonal. Let $g=f\circ U$. Then

\begin{align} g_i(x) &=\lim_{t\to0}\frac{g(x+t\,e_i)-g(x)}{t}\\ &=\lim_{t\to0}\frac{f(U(x+t\,e_i))-f(Ux)}{t}\\ &=\lim_{t\to0}\frac{f(Ux+t\,Ue_i)-f(Ux)}{t}=f_{v_i}(Ux) \end{align} In particular, $f_{v_i}(x)=g_i(U^Tx)$. Do you think you can take it from here?


Notice that we used $U^T$ in the last step, but only because $U$ is orthogonal. The same argument works fine provided $U$ is invertible, that is, provided $\{v_1,\dots,v_n\}$ is a basis.
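Here is a numerical sketch of the identity $f_{v_i}(x)=g_i(U^Tx)$ for $g=f\circ U$; the function $f$ is a made-up example, and $U$ is a random orthogonal matrix whose columns play the role of the orthonormal basis $\{v_i\}$:

```python
import numpy as np

# Toy f : R^2 -> R, and a random orthonormal basis (the columns of U).
def f(p):
    return p[0] ** 2 + np.exp(p[1]) * p[0]

def directional(f, x, v, eps=1e-6):
    """Central-difference directional derivative of f at x along v."""
    return (f(x + eps * v) - f(x - eps * v)) / (2 * eps)

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((2, 2)))  # orthogonal U
g = lambda x: f(U @ x)                            # g = f o U
x = np.array([0.7, -0.2])

for i, e in enumerate(np.eye(2)):
    lhs = directional(f, x, U[:, i])        # f_{v_i}(x)
    rhs = directional(g, U.T @ x, e)        # g_i(U^T x)
    print(np.isclose(lhs, rhs, atol=1e-6))  # True for each i
```

Replacing `U.T` with `np.linalg.inv(U)` makes the same check work for any invertible $U$, matching the remark above.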

On

First, your quoted theorem

If the partial derivatives of $f$ exist in a neighbourhood of $c$ and are continuous at $c$, then $f$ is differentiable at $c$.

may or may not actually be true, depending on how you define differentiable. If you are referring to Gâteaux differentiability then you are correct, but surprisingly not if you mean Fréchet differentiability. See this post for a counterexample.

Your question actually touches on something very subtle. Consider the Jacobian of the function $f$, that is, the $p \times p$ matrix $J$ (in the standard basis) defined by

$$ J_{ij} = \frac{\partial f_i}{\partial x_j}, $$

where $f_i$ is the $i$th component of $f$ and $x_j$ the $j$th coordinate in the standard basis. Suppose now we transform from the standard basis to a new basis by the transformation $T : (x_j) \mapsto (x_j')$. We then have a new function

$$ \tilde{f} = T\circ f\circ T^{-1}. $$

(That is, $\tilde{f}$ first takes a vector in the new coordinates and converts it back to the old, applies $f$, and then converts the result back to the new coordinates.) Using the chain rule, one readily deduces that the new Jacobian $J'$ is related to the old by

$$ J' = PJP^{-1}, $$

where $P$ is the change of basis matrix of the transformation $T$. Thus we are free to work in whichever coordinates we like, and can readily convert to a different set of linear coordinates by applying the standard change-of-basis matrix from linear algebra. We are at liberty to take the standard basis because, as this argument shows, the partial derivatives in any other linear coordinate system exist and are given by a change of basis with respect to the standard coordinates.
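The relation $J' = PJP^{-1}$ can be verified numerically. The function $f$ and the invertible matrix $P$ below are made up for illustration, and the Jacobians are approximated by central differences:

```python
import numpy as np

# Toy f : R^2 -> R^2 and an arbitrary invertible change-of-basis matrix P.
def f(p):
    return np.array([p[0] * p[1], np.sin(p[0]) + p[1] ** 2])

def jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian: J[i, j] = d f_i / d x_j."""
    cols = [(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
            for e in np.eye(len(x))]
    return np.column_stack(cols)

P = np.array([[1.0, 2.0],
              [0.0, 1.0]])                   # invertible, not orthogonal
f_tilde = lambda y: P @ f(np.linalg.inv(P) @ y)   # f in the new coordinates

x = np.array([0.3, 0.9])                     # a point in the old coordinates
J = jacobian(f, x)
J_prime = jacobian(f_tilde, P @ x)           # same point, new coordinates
print(np.allclose(J_prime, P @ J @ np.linalg.inv(P), atol=1e-5))  # True
```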


Addendum: This reasoning also explains a certain puzzle: when we encounter a vector function $\mathbf{u} : \Bbb R^3 \to \Bbb R^3$ in physics, there are nine derivatives $\partial u_i/\partial x_j$ for $i,j = 1,2,3$. Why then do we only ever encounter two particular expressions $\nabla \cdot \mathbf{u}$ and $\nabla \times \mathbf{u}$ in our physics books?

It is because these combinations of partial derivatives are (in some sense) the only "derivatives" invariant under orthogonal transformations (and $\nabla \times \mathbf{u}$ is only "kind of" invariant under coordinate transformations, see this.)

We can see that $\nabla \cdot \mathbf{u}$ is invariant under coordinate transformations since it is the trace of the Jacobian, and the trace of a matrix is invariant with respect to change of basis.
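This trace invariance is easy to confirm numerically. With a toy vector field $\mathbf{u}$ (invented for illustration), $\nabla\cdot\mathbf{u} = \operatorname{tr} J$ comes out the same in the original coordinates and after an arbitrary invertible linear change of coordinates, since $\operatorname{tr}(PJP^{-1}) = \operatorname{tr} J$:

```python
import numpy as np

# Toy vector field u : R^3 -> R^3.
def u(p):
    return np.array([p[0] ** 2, p[0] * p[2], np.cos(p[1])])

def jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian: J[i, j] = d f_i / d x_j."""
    cols = [(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
            for e in np.eye(len(x))]
    return np.column_stack(cols)

P = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [1.0, 0.0, 1.0]])              # any invertible matrix
u_tilde = lambda y: P @ u(np.linalg.inv(P) @ y)   # u in the new coordinates

x = np.array([0.5, 1.0, -0.3])
div_old = np.trace(jacobian(u, x))           # div u in old coordinates
div_new = np.trace(jacobian(u_tilde, P @ x)) # div u in new coordinates
print(np.isclose(div_old, div_new, atol=1e-5))  # True
```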