Spivak Physics for Mathematicians - Help understanding the derivative of a one-parameter family of linear transformations.


Most physics textbooks I've consulted only confused me further about rotating coordinate systems, so I turned to Spivak's book. I have two questions, both about the derivative near the bottom of the page:

$r' = B(\rho') + B'(\rho)$

Even if we write it out in uncondensed notation, the formula reads $r'(t) = B(t)(\rho'(t)) + B'(t)(\rho(t))$.

Firstly, I am not sure how he gets this formula. It looks like he used a product rule of sorts, and I don't see how or why one applies here.

Secondly, I'm not sure how to interpret the term $B'(t)$. As I understand it, $B$ is a function from $\mathbb{R}_{\geq0}$ into Hom$(\mathbb{R}^3)$, so how can we differentiate such an object? I know that the derivative of a function from, say, $\mathbb{R}^n$ into $\mathbb{R}^m$ is a linear transformation, but what about the case where the target space is a general vector space?

So if someone could explain this step in more detail, that would be much appreciated :)

Taken from Michael Spivak's book "Physics for Mathematicians"


There are 2 best solutions below

On BEST ANSWER

First we define differentiability and derivative of a map between finite-dimensional real vector spaces. You should have seen the following definition of differentiability when the vector spaces are $\Bbb R^m,\Bbb R^n$. Let $U$ be an open subset of $\Bbb R^m$. We say that $f:U\to\Bbb R^n$ is differentiable at a point $p\in U$ if there exists a necessarily unique linear transformation $L:\Bbb R^m\to\Bbb R^n$ such that

$$\lim\limits_{h\to0}\frac{\lVert f(p+h)-f(p)-L(h)\rVert_{\Bbb R^n}}{\lVert h\rVert_{\Bbb R^m}}=0,$$

where $h$ ranges over vectors in $\Bbb R^m$. The map $L$ is called the derivative of $f$ at $p$.

One thing to note about the above definition is that, while we usually use the standard inner-product norms on $\Bbb R^m, \Bbb R^n$, we can use any other norms on $\Bbb R^m, \Bbb R^n$ in place of $\lVert\cdot\rVert_{\Bbb R^m},\lVert\cdot\rVert_{\Bbb R^n}$, and both the differentiability of $f$ and its derivative are the same no matter which norms we use. This is because all norms on a finite-dimensional real or complex vector space are equivalent, and equivalent norms give the same notion of differentiability and the same derivative. See, for example, https://math.stackexchange.com/a/2183054/356114.

Now, for arbitrary finite-dimensional vector spaces and maps between them, you can first give each of the vector spaces a norm, and then define differentiability and the derivative in the same manner as in the $\Bbb R^n$ case. Again, the differentiability and the derivative of $f$ do not depend on the choice of norms.

As a note, the following abuse of notation will be used for a differentiable function $f:U\to V$ whose domain $U$ is an open subset of $\Bbb R$. This is very specific: the domain has to lie in $\Bbb R$ itself, not in a vector space of dimension other than one, and not in any other one-dimensional vector space. The derivative at a point $p$ is a linear map $L:\Bbb R\to V$, which we denote $L=:f'(p)$. Consider the vector $v\in V$ defined by $f'(p)(1)=:v$. We shall abuse notation by saying that $f'(p)$ is the vector $v$ itself. In the question, $V$ is the nine-dimensional vector space of linear maps $\Bbb R^3\to\Bbb R^3$, so $B'(t)$ is again such a linear map, which is why the expression $B'(t)(\rho(t))$ makes sense.
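To make $B'(t)$ concrete, here is a small worked example. It assumes we identify linear maps $\Bbb R^3\to\Bbb R^3$ with $3\times3$ matrices via the standard basis, in which case the derivative is computed entry by entry. The particular family below (rotation about the $z$-axis with a constant angular speed $\omega$) is only an illustrative choice and is not taken from Spivak's text.

$$B(t)=\begin{pmatrix}\cos\omega t&-\sin\omega t&0\\ \sin\omega t&\cos\omega t&0\\ 0&0&1\end{pmatrix},\qquad B'(t)=\omega\begin{pmatrix}-\sin\omega t&-\cos\omega t&0\\ \cos\omega t&-\sin\omega t&0\\ 0&0&0\end{pmatrix}.$$

Note that $B'(t)$ is a perfectly good linear map, but it is not a rotation; for instance, it sends the $z$-axis to $0$.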

Now we proceed to the product rule. The formula $r'=B(\rho')+B'(\rho)$ is indeed a product rule. The product rule actually applies much more generally than what you may have seen. Whenever you have finite-dimensional real vector spaces $U,V,W$ and a function $f:U\times V\to W$ that is bilinear, meaning that for all $c\in\Bbb R, u_1,u_2,u\in U,v_1,v_2,v\in V$, we have $$f(cu_1+u_2,v)=cf(u_1,v)+f(u_2,v),$$and$$f(u,cv_1+v_2)=cf(u,v_1)+f(u,v_2),$$ you can derive a product rule involving $f$, as follows.

Say you have differentiable functions $g:\Bbb R\to U, h:\Bbb R\to V$. Then the function $\Bbb R\to W, t\mapsto f(g(t),h(t))$ is differentiable and its derivative is $f(g'(t),h(t))+f(g(t),h'(t))$. This is proved with the chain rule applied to the composition of $t\mapsto(g(t),h(t))$ and $f$.
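For completeness, here is a sketch of that chain rule argument. The increments $a\in U$, $b\in V$, the constant $C$ and the name $\gamma$ are notation introduced only for this sketch. Since $f$ is bilinear and the spaces are finite-dimensional, there is a constant $C$ with $\lVert f(a,b)\rVert_W\le C\lVert a\rVert_U\lVert b\rVert_V$, and expanding by bilinearity gives

$$f(u+a,v+b)-f(u,v)-\bigl[f(a,v)+f(u,b)\bigr]=f(a,b).$$

The right-hand side is of second order in $(a,b)$, so the derivative of $f$ at $(u,v)$ is the linear map $(a,b)\mapsto f(a,v)+f(u,b)$. Writing $\gamma(t)=(g(t),h(t))$, so that $\gamma'(t)=(g'(t),h'(t))$, the chain rule then yields

$$(f\circ\gamma)'(t)=Df(\gamma(t))\bigl(\gamma'(t)\bigr)=f(g'(t),h(t))+f(g(t),h'(t)).$$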

The map that applies a linear transformation to a vector, $(B,\rho)\mapsto B(\rho)$, is a bilinear function. So you can use the product rule when differentiating $r$.
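If you want to convince yourself numerically, the following is a minimal sketch using NumPy. Everything in it is an assumption made for illustration: the angular speed `OMEGA`, the rotation about the $z$-axis used for `B`, and the path `rho` are hypothetical choices, not taken from Spivak's text. It compares a central finite difference of $r(t)=B(t)\rho(t)$ with the product rule formula $B(t)\rho'(t)+B'(t)\rho(t)$.

```python
import numpy as np

OMEGA = 0.7  # hypothetical angular speed, chosen only for this check


def B(t):
    """Rotation about the z-axis by the angle OMEGA * t (illustrative choice)."""
    c, s = np.cos(OMEGA * t), np.sin(OMEGA * t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def B_prime(t):
    """Entrywise derivative of B with respect to t."""
    c, s = np.cos(OMEGA * t), np.sin(OMEGA * t)
    return OMEGA * np.array([[-s, -c, 0.0], [c, -s, 0.0], [0.0, 0.0, 0.0]])


def rho(t):
    """Hypothetical path, given in rotating-frame coordinates."""
    return np.array([t**2, np.sin(t), 1.0 + t])


def rho_prime(t):
    """Derivative of rho, computed by hand."""
    return np.array([2.0 * t, np.cos(t), 1.0])


def r(t):
    """The same path in fixed-frame coordinates: r(t) = B(t) rho(t)."""
    return B(t) @ rho(t)


def product_rule(t):
    """Right-hand side of r' = B(rho') + B'(rho)."""
    return B(t) @ rho_prime(t) + B_prime(t) @ rho(t)


def central_difference(f, t, h=1e-6):
    """Symmetric finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


t0 = 1.3
max_error = np.max(np.abs(central_difference(r, t0) - product_rule(t0)))
print(max_error)  # should be tiny, limited only by finite-difference error
```

The two sides agree up to the accuracy of the finite difference, which is what the product rule for the bilinear map $(B,\rho)\mapsto B(\rho)$ predicts.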

On

When in "static" linear algebra we replace the standard basis $({\bf e}_i)_{1\leq i\leq 3}$ of $V={\mathbb R}^3$ by a new basis $({\bf u}_i)_{1\leq i\leq 3}$, the necessary data are stored in a matrix $B$ whose columns are the coordinates of the new basis vectors in terms of the standard basis. If the new basis is again orthonormal, the matrix $B$ is orthogonal, and one has $B^{-1}=B^\top$.

The points ${\bf r}\in V$ are not moved around, but they obtain new coordinates. Let ${\bf r}$ have coordinates $(r_1,r_2,r_3)$ with respect to the standard basis and coordinates $(\rho_1,\rho_2,\rho_3)$ with respect to the new basis. It is customary to write these coordinate triples as column vectors ${\bf r}$ (one is overloading this symbol here), resp. ${\pmb \rho}$. Under this convention we have the formulas $${\bf r}=B{\pmb\rho},\qquad {\pmb\rho}=B^{-1}{\bf r}\ .\tag{1}$$

Now, in the quoted passage of Spivak's book, the standard basis $({\bf e}_i)_{1\leq i\leq 3}$ is viewed as a rigid frame that has actually been rotated to the new position $({\bf u}_i)_{1\leq i\leq 3}$. But the individual points of space have not been moved. Therefore the formulas $(1)$ are still valid under this view of things.

So far everything was "static". But now we set the film in motion: (i) We have a point $t\mapsto{\bf r}(t)\in V$ that describes an orbit in time, and (ii) the fixed new frame $({\bf u}_i)_{1\leq i\leq 3}$ is replaced by a rotating frame $${\bf u}_i(t)=B(t) {\bf e}_i\qquad(1\leq i\leq3)\ .$$ It follows that the formulas $(1)$ hold for each fixed $t$: $${\bf r}(t)=B(t){\pmb\rho}(t),\qquad {\pmb\rho}(t)=B^{-1}(t){\bf r}(t)\ .\tag{2}$$ The matrix products appearing in $(2)$ are bilinear, and this implies the product rule when differentiating with respect to $t$. Therefore we obtain, e.g., $${\bf r}'(t)=B'(t){\pmb\rho}(t)+B(t){\pmb\rho}'(t)=B'(t)B^{-1}(t){\bf r}(t)+B(t){\pmb\rho}'(t)\ ,$$ although the exact purpose of this last formula remains unclear to me.
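One possible reading of the last formula, offered as a sketch and not taken from the quoted passage, is the following. Assume $B(t)$ is orthogonal for every $t$, as it is for a rotating orthonormal frame, so that $B(t)B(t)^\top=I$. The matrix $A(t)$ and the vector ${\pmb\omega}(t)$ below are names introduced only for this sketch. Differentiating with the product rule gives

$$B'(t)B(t)^\top+B(t)B'(t)^\top=0,\qquad\text{so}\qquad A(t):=B'(t)B^{-1}(t)=B'(t)B(t)^\top\quad\text{satisfies}\quad A(t)^\top=-A(t).$$

A skew-symmetric $3\times3$ matrix acts on vectors as a cross product: $A{\bf r}={\pmb\omega}\times{\bf r}$ with ${\pmb\omega}=(A_{32},A_{13},A_{21})$. Under this assumption the formula becomes

$${\bf r}'(t)={\pmb\omega}(t)\times{\bf r}(t)+B(t){\pmb\rho}'(t),$$

which splits the velocity into a part caused by the rotation of the frame and the velocity seen in the rotating frame, re-expressed in the fixed frame.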