Differentiating a simple, single-variable equation involving a vector


Please forgive how simple this is, but I can't seem to find any explanations for how to differentiate single-variable equations of the following form:

$f(\boldsymbol{x}) = 5\boldsymbol{x}$, where $\boldsymbol{x}$ is an $n \times 1$ vector of real values; i.e., $\boldsymbol{x} = \langle a_1, a_2, a_3, \dots, a_n \rangle$ for $a_i \in \mathbb{R}$.

1 Answer
The differential of a mapping $f: \mathbb{R}^n \rightarrow \mathbb{R}^n$ at a point $p$, if it exists, is the linear transformation $df_p: \mathbb{R}^n \rightarrow \mathbb{R}^n$ which best approximates the change in $f$ near $p$. In particular, the differential $df_p$ is implicitly defined by the Fréchet quotient: $$ \lim_{h \rightarrow 0} \frac{f(p+h)-f(p)-df_p(h)}{\| h \|} = 0$$ For small $h$, $f(p+h) \simeq f(p) + df_p(h)$. Here $\| h \| = \sqrt{ h \cdot h}$ is the length of $h$; notice we cannot divide by $h$ itself, since division by a vector is generally not defined.

The relation of the differential to the partial derivatives more commonly taught in introductory calculus is given by the definition $\frac{\partial f}{\partial x_i}(p) = df_p(e_i)$, where $(e_i)_j = \delta_{ij}$, or equivalently $e_i \cdot e_j = \delta_{ij}$. Here I use $e_1,e_2,\dots , e_n$ to denote the standard basis for $\mathbb{R}^n$. Incidentally, this definition of partial derivatives applies equally well to a basis for an abstract finite-dimensional normed linear space. Getting back to the main story, $$ J_f(p) = [df_p] = [df_p(e_1)|df_p(e_2)| \cdots | df_p(e_n)] = \left[ \frac{\partial f}{\partial x_1}(p)\bigg{|}\frac{\partial f}{\partial x_2}(p)\bigg{|}\cdots \bigg{|}\frac{\partial f}{\partial x_n}(p) \right] $$ is the Jacobian matrix of $f$ at $p$.
The relation between $df_p$ and $J_f(p)$ is given by matrix multiplication: $$ df_p(h) = J_f(p)h $$ We can view the Jacobian as a stack of gradient vectors, one for each component function of $f = (f_1,f_2, \dots , f_n)$; $\nabla f_j = [\partial_1 f_j, \dots , \partial_n f_j]^T$ and $$ J_f = \left[ \begin{array}{c} (\nabla f_1)^T \\ (\nabla f_2)^T \\ \vdots \\ (\nabla f_n)^T \end{array}\right] $$ Thus, $$ df_p(h) = \left[ \begin{array}{c} (\nabla f_1)^T \\ (\nabla f_2)^T \\ \vdots \\ (\nabla f_n)^T \end{array}\right]\left[ \begin{array}{c} h_1 \\ h_2 \\ \vdots \\ h_n \end{array}\right] = \left[ \begin{array}{c} (\nabla f_1) \cdot h \\ (\nabla f_2)\cdot h \\ \vdots \\ (\nabla f_n) \cdot h\end{array}\right]. $$ In fact, the derivative (differential) of $f$ involves many gradients at once, working in concert as above.

You see, the larger confusion here is the tendency for students to assume the derivative of a function on $\mathbb{R}^n$ should be another function on $\mathbb{R}^n$. It's not. The first derivative is naturally identified with the pointwise assignment of a linear map at each point where the Fréchet quotient exists. It then turns out that the $k$-th derivative of a function on $\mathbb{R}^n$ can be identified with the pointwise assignment of a completely symmetric $k$-linear mapping. These things are explained rather nicely in Volume 2 of Zorich's Mathematical Analysis; the material is standard in any higher course in multivariate analysis.
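If it helps to see the column-by-column definition concretely, here is a rough numerical sketch (names like `numerical_jacobian` are my own, not standard): approximating $df_p(e_i) = \frac{\partial f}{\partial x_i}(p)$ by a difference quotient along each basis vector and assembling the results into the Jacobian matrix.

```python
import numpy as np

def numerical_jacobian(f, p, eps=1e-6):
    """Approximate J_f(p) one column at a time: column i is the
    difference quotient (f(p + eps*e_i) - f(p)) / eps, a stand-in
    for df_p(e_i) = del f / del x_i at p."""
    p = np.asarray(p, dtype=float)
    fp = f(p)
    n = p.size
    J = np.empty((fp.size, n))
    for i in range(n):
        step = np.zeros(n)
        step[i] = eps          # eps times the basis vector e_i
        J[:, i] = (f(p + step) - fp) / eps
    return J

# Example: f(x, y) = (x^2, x*y) has Jacobian [[2x, 0], [y, x]],
# so at p = (2, 3) we expect approximately [[4, 0], [3, 2]].
f = lambda v: np.array([v[0]**2, v[0] * v[1]])
print(numerical_jacobian(f, np.array([2.0, 3.0])))
```

Each row of the result is (approximately) the transposed gradient of one component function, matching the stacked-gradient picture above.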

Getting back to your actual function $f(x) = 5x$: this function is linear, so the best linear approximation to the function is essentially the function itself. Indeed, $f(p+h) = 5(p+h) = 5p + 5h = f(p) + 5h$ exactly, with no remainder term, so $df_p(h) = 5h$. We can calculate $J_f(p) = 5I_n$, where $I_n$ is the $n \times n$ identity matrix. Or, if you prefer, $df_p(h) = 5I_n h = 5h$ for each $p \in \mathbb{R}^n$.
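As a quick sanity check (a sketch of mine, not part of the answer's derivation), the difference quotients for $f(x) = 5x$ recover $5I_n$ at an arbitrary point, since the function equals its own linear approximation:

```python
import numpy as np

n = 4
f = lambda x: 5.0 * x

# Arbitrary base point p; for a linear map the answer is the same everywhere.
p = np.random.default_rng(0).standard_normal(n)

eps = 1e-6
# Column i approximates df_p(e_i); stack columns to form the Jacobian.
J = np.column_stack([(f(p + eps * np.eye(n)[:, i]) - f(p)) / eps
                     for i in range(n)])
print(np.allclose(J, 5 * np.eye(n)))
```

Up to floating-point rounding, every column is $5e_i$, i.e. $J_f(p) = 5I_n$ as computed above.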