How to take total derivative of $F^2 (x, \lambda \dot x)$ with respect to a scalar?


I have a function $F^2 (x, \lambda \dot x)$, which is a scalar function:

  • $F^2 (x, \lambda \dot x): \mathbb{R}^n \rightarrow \mathbb{R}$
  • $\lambda$ is a scalar. $\lambda \in \mathbb{R}$
  • $x \in \mathbb{R}^n$

From what my colleague writes on the board,

$\frac{d }{d \lambda} F^2 (x, \lambda \dot x)= \frac{\partial F^2 (x, \lambda \dot x)}{\partial \dot x} * \dot x$

and

$\frac{d^2 }{d \lambda ^2}F^2 (x, \lambda \dot x) = \dot x^T \frac{\partial^2 }{\partial \dot x^2} F^2 (x, \lambda \dot x) * \dot x$.

I am trying to justify his math by using this definition of the total derivative:

$\frac{d }{d \lambda} F^2 (x, \lambda \dot x)=\frac{\partial }{\partial \lambda} F^2 (x, \lambda \dot x)+\frac{\partial }{\partial \lambda \dot x} F^2 (x, \lambda \dot x)\frac{\partial }{\partial \lambda} \lambda \dot x$

By using that definition of the total derivative, then we seem to be saying the following:

1) $\frac{\partial }{\partial \lambda} F^2 (x, \lambda \dot x)=0$ (and this is because $\lambda$ is a scalar)

2) $\frac{\partial }{\partial \lambda \dot x} F^2 (x, \lambda \dot x) = \frac{\partial }{ \lambda \partial \dot x} F^2 (x, \lambda \dot x)= \frac{\partial }{\partial \dot x} F^2 (x, \lambda \dot x) $

Now, I am just looking for a little help with the math here. Are the justifications 1) and 2) correct? If not, then how does my colleague justify this?

As a final question, why is it not true that $\frac{\partial }{\partial \dot x} F^2 (x, \lambda \dot x)= 2 F (x, \lambda \dot x) * \lambda$? When I apply the chain rule, as I am accustomed to doing, I am getting $\frac{\partial }{\partial \dot x} F^2 (x, \lambda \dot x)= 2 F (x, \lambda \dot x) * \lambda$. I am assuming the reason we do not apply the chain rule here is because we are taking the total derivative and not the partial derivative. Is that true?

Thank you very much for helping me, I know this is basic stuff!


There are 2 best solutions below

2
On

$$\frac{d}{d\lambda}F^2(x,\lambda\dot x)=\frac{\partial}{\partial x}F^2(x,\lambda\dot x)\cdot\frac{\partial}{\partial\lambda} x + \frac{\partial}{\partial\lambda\dot x}F^2(x,\lambda\dot x)\cdot \frac{\partial}{\partial\lambda}\lambda\dot x =\frac{\partial}{\partial\lambda\dot x}F^2(x,\lambda\dot x) \cdot \dot x$$ since $x$ is independent of $\lambda$. Note that we differentiate $F^2$ with respect to each of its arguments and multiply by the corresponding inner derivative.

Edit: As MasterYoda correctly derived in his answer, $$\frac{\partial}{\partial\lambda\dot x}F^2(x,\lambda\dot x) = \frac{1}{\lambda}\frac{\partial}{\partial\dot x}F^2(x,\lambda\dot x)\,.$$ A slightly different way to see this is to consider the function $f(\cdot):=F^2(x,\cdot)$. Note that the partial derivative of $F^2$ with respect to its second argument equals the (total) derivative of $f$. Then the chain rule gives $$\frac{\partial}{\partial\dot x}F^2(x,\lambda\dot x)= \frac{\partial}{\partial\dot x}f(\lambda\dot x)=\lambda f'(\lambda\dot x)=\lambda \frac{\partial}{\partial\lambda\dot x}F^2(x,\lambda\dot x)\,.$$ Now rearrange (for $\lambda \neq 0$).
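The two forms of the first derivative can be checked numerically. The sketch below is only an illustration: the test function `G` (standing in for $F^2$), the sample point, and the step size `h` are my own assumptions, not part of the question. It compares a finite-difference value of $\frac{d}{d\lambda}F^2(x,\lambda\dot x)$ with (a) the derivative with respect to the second argument, evaluated at $\lambda\dot x$ and contracted with $\dot x$, and (b) $\frac{1}{\lambda}$ times the derivative of the composite with respect to $\dot x$, contracted with $\dot x$.

```python
import math

def G(x, y):
    # Hypothetical smooth test function standing in for F^2(x, y).
    return sum(xi * yi**2 for xi, yi in zip(x, y)) + math.sin(y[0] * y[1])

def grad_second_slot(x, y, h=1e-6):
    # Central-difference gradient of G with respect to its second argument y.
    g = []
    for i in range(len(y)):
        yp, ym = list(y), list(y)
        yp[i] += h
        ym[i] -= h
        g.append((G(x, yp) - G(x, ym)) / (2 * h))
    return g

def grad_wrt_xdot(x, xdot, lam, h=1e-6):
    # Central-difference gradient of the composite xdot -> G(x, lam * xdot).
    g = []
    for i in range(len(xdot)):
        vp, vm = list(xdot), list(xdot)
        vp[i] += h
        vm[i] -= h
        g.append((G(x, [lam * v for v in vp]) - G(x, [lam * v for v in vm])) / (2 * h))
    return g

# Assumed sample point (any point with lam != 0 would do).
x = [1.0, 2.0]
xdot = [0.3, -0.7]
lam = 1.5
h = 1e-6

# d/dlam of G(x, lam * xdot), by central differences in lam.
lhs = (G(x, [(lam + h) * v for v in xdot]) - G(x, [(lam - h) * v for v in xdot])) / (2 * h)

# (a) derivative in the second slot at y = lam * xdot, contracted with xdot.
y = [lam * v for v in xdot]
slot_form = sum(g * v for g, v in zip(grad_second_slot(x, y), xdot))

# (b) (1/lam) * derivative of the composite with respect to xdot, contracted with xdot.
scaled_form = sum(g * v for g, v in zip(grad_wrt_xdot(x, xdot, lam), xdot)) / lam

print(lhs, slot_form, scaled_form)
```

All three printed numbers should agree up to finite-difference error, which suggests the disagreement over the factor $1/\lambda$ comes down to what $\partial/\partial\dot x$ is taken to mean.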

2
On

Interpreting what your colleague wrote, it appears $F^2$ is itself a function and not the square of $F$, correct? If it is the square, simply replace $F^2$ with $G := F^2$; the argument is the same. Applying the chain rule to $F^2$ gives

$$\begin{align} \frac{d}{d\lambda}F^2(x,\lambda\dot x) &= \frac{\partial F^2(x,\lambda\dot x)}{\partial\lambda} + \frac{\partial F^2(x,\lambda\dot x)}{\partial x}\frac{\partial x}{\partial \lambda} + \frac{\partial F^2(x,\lambda\dot x)}{\partial\lambda\dot x}\frac{\partial \lambda\dot x}{\partial \lambda} \\&= \frac{1}{\lambda} \frac{\partial F^2(x,\lambda\dot x)}{\partial\dot x}\dot x \end{align}$$

To answer 1), notice that the first term is zero because $F^2$ does not depend on $\lambda$ directly; $\lambda$ enters only through the second argument $\lambda\dot x$. The second term is also zero because $x$ does not depend on $\lambda$, so $dx/d\lambda=0$. As for claim 2), that $\partial/\partial\lambda\dot x = \partial/\partial\dot x$, I think it may be wrong. You can wave your hands and say that $\lambda\dot x$ points in the same direction as $\dot x$, which may be true, but I think doing that leaves the result off by a scaling factor of $1/\lambda$. Here is my long but (I think) mathematically appropriate justification. Let $y = \lambda\dot x$, with $\lambda \neq 0$. Then

$$\frac{\partial}{\partial\lambda\dot x} = \frac{\partial}{\partial y} = \frac{d\dot x}{dy}\frac{\partial}{\partial\dot x} = \frac{d(y/\lambda)}{dy}\frac{\partial}{\partial\dot x} = \frac{1}{\lambda}\frac{\partial}{\partial\dot x}$$
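As a sanity check on the $1/\lambda$ factor, here is a small worked example. The test function $G(x,y)=\|y\|^2$ is my own choice for illustration and is not taken from the question. Substituting $y=\lambda\dot x$ and differentiating directly gives

$$G(x,\lambda\dot x)=\lambda^2\|\dot x\|^2 \quad\Longrightarrow\quad \frac{d}{d\lambda}G(x,\lambda\dot x)=2\lambda\|\dot x\|^2 .$$

Differentiating the composite with respect to $\dot x$ and applying the formula above gives the same value:

$$\frac{\partial}{\partial\dot x}G(x,\lambda\dot x)=2\lambda^2\dot x^T \quad\Longrightarrow\quad \frac{1}{\lambda}\,\frac{\partial G(x,\lambda\dot x)}{\partial\dot x}\,\dot x = 2\lambda\|\dot x\|^2 .$$

If instead $\partial/\partial\dot x$ is read as the derivative of $G$ with respect to its second argument, evaluated at $y=\lambda\dot x$, then

$$\left.\frac{\partial G(x,y)}{\partial y}\right|_{y=\lambda\dot x}\dot x = 2\lambda\dot x^T\dot x = 2\lambda\|\dot x\|^2 ,$$

which matches the formula on the board without the $1/\lambda$. So, at least in this example, the two formulas differ only in what $\partial/\partial\dot x$ is taken to mean.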

Now for the second derivative. For derivatives with respect to vectors it is simplest to work element-wise. So let $\dot x_i$ be the $i$-th element of the vector $\dot x$.

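$$\begin{align} \frac{d^2}{d\lambda^2}F^2(x,\lambda\dot x) &= \frac{d}{d\lambda} \left[\sum_i \frac{1}{\lambda} \frac{\partial F^2(x,\lambda\dot x)}{\partial\dot x_i}\dot x_i\right] \\ &= \sum_i \frac{1}{\lambda} \frac{\partial F^2(x,\lambda\dot x)}{\partial\dot x_i}\frac{d\dot x_i}{d\lambda} + \sum_i \frac{d}{d\lambda} \left[\frac{1}{\lambda} \frac{\partial F^2(x,\lambda\dot x)}{\partial\dot x_i}\right]\dot x_i \\ &= \frac{1}{\lambda^2}\sum_{i,j}\frac{\partial^2 F^2(x,\lambda\dot x)}{\partial\dot x_i\,\partial\dot x_j}\dot x_i\dot x_j \\ &\implies \boxed{\frac{d^2}{d\lambda^2}F^2(x,\lambda\dot x)=\frac{1}{\lambda^2}\dot x^T\frac{\partial^2 F^2(x,\lambda\dot x)}{\partial\dot x^2}\dot x} \end{align}$$

The first sum vanishes because $\dot x_i$ does not depend on $\lambda$. In the second sum, $\frac{1}{\lambda}\frac{\partial F^2}{\partial\dot x_i}$ is again a function of $(x,\lambda\dot x)$, so the first-derivative result applies to it and gives $\frac{d}{d\lambda}\left[\frac{1}{\lambda}\frac{\partial F^2}{\partial\dot x_i}\right]=\frac{1}{\lambda^2}\sum_j\frac{\partial^2 F^2}{\partial\dot x_i\,\partial\dot x_j}\dot x_j$.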
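The boxed formula can also be checked numerically. This is a rough sketch under my own assumptions: the test function `G` (standing in for $F^2$), the sample point, and the step size `h` are hypothetical choices, and the Hessian is taken of the composite $\dot x \mapsto F^2(x,\lambda\dot x)$, which is the reading that produces the $1/\lambda^2$ factor.

```python
import math

def G(x, y):
    # Hypothetical smooth test function standing in for F^2(x, y).
    return sum(xi * yi**2 for xi, yi in zip(x, y)) + math.sin(y[0] * y[1])

def composite(x, xdot, lam):
    # The map (xdot, lam) -> G(x, lam * xdot).
    return G(x, [lam * v for v in xdot])

def hessian_wrt_xdot(x, xdot, lam, h=1e-4):
    # Central-difference Hessian of the composite with respect to xdot.
    n = len(xdot)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                v = list(xdot)
                v[i] += si * h
                v[j] += sj * h
                return composite(x, v, lam)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

# Assumed sample point (lam must be nonzero).
x = [1.0, 2.0]
xdot = [0.3, -0.7]
lam = 1.5
h = 1e-4

# Second derivative in lam, by central differences.
lhs2 = (composite(x, xdot, lam + h) - 2 * composite(x, xdot, lam)
        + composite(x, xdot, lam - h)) / (h * h)

# (1 / lam^2) * xdot^T H xdot, with H the Hessian of the composite in xdot.
H = hessian_wrt_xdot(x, xdot, lam)
n = len(xdot)
rhs2 = sum(xdot[i] * H[i][j] * xdot[j] for i in range(n) for j in range(n)) / lam**2

print(lhs2, rhs2)
```

The two printed values should agree up to finite-difference error. Dropping the division by `lam**2` would leave them off by a factor of $\lambda^2$ for this reading of the Hessian.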