Differential of a multivariable function


I'm trying to find the differential of the following multivariable function and then use the external definition of the gradient to find its gradient: $$ f(\bar{x})=\phi(A\bar{x}), $$ where $\bar{x}\in\mathbb{R}^{m}$, $A\in\mathbb{R}^{n\times m}$, $\phi:\mathbb{R}^{n}\rightarrow\mathbb{R}$, and $f:\mathbb{R}^{m}\rightarrow\mathbb{R}$.

My calculation is:

Let $u=Ax$, then $$df\underbrace{=}_{definition\,of\,differential}\frac{\phi(u)}{du}\cdot du\underbrace{=}_{substitution+chain\,rule}\frac{d\phi(Ax)}{dAx_{1}}\cdot dAx_{1}\cdot\underbrace{Adx_{1}}_{inner\,derivative}+\ldots+\frac{d\phi(Ax)}{dAx_{n}}\cdot dAx_{n}\cdot\underbrace{Adx_{n}}_{inner\,derivative}=\Sigma_{i=1}^{n}\frac{d\phi(Ax)}{dAx_{i}}\cdot dAx_{i}\cdot\underbrace{Adx_{i}}_{inner\,derivative}\underbrace{=}_{inner\,product\,definition}\langle A^{T}(\nabla\phi(Ax))^{T},dAxdx\rangle $$

What's my mistake here and how should it be done correctly?


Converting to indices makes things easier. Let $A=A^j_{\:i}$ (rows then columns) and let $$u^j = \sum_i A^j_{\:i} \: x^i$$ so that

$$ f(x) = \phi(Ax) =\phi(u) \text{ .}$$ By definition of the differential, we have that

$$df = \sum_k \dfrac{\partial f}{\partial x^k} dx^k $$ and thus

$$ df = \sum_k \dfrac{\partial \phi}{\partial x^k} dx^k \text{ .}$$

By the chain rule, we know that

$$\dfrac{\partial \phi(u(x))}{\partial x^k} = \sum_j\dfrac{\partial \phi}{\partial u^j} \dfrac{\partial u^j}{\partial x^k} $$ therefore

$$ df = \sum_k \sum_j\dfrac{\partial \phi}{\partial u^j} \dfrac{\partial u^j}{\partial x^k} dx^k \text{ .}$$
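As a concrete instance (assuming, purely for illustration, $m=n=2$), the double sum written out has four terms:

$$ df = \bigg(\dfrac{\partial \phi}{\partial u^1}\dfrac{\partial u^1}{\partial x^1} + \dfrac{\partial \phi}{\partial u^2}\dfrac{\partial u^2}{\partial x^1}\bigg) dx^1 + \bigg(\dfrac{\partial \phi}{\partial u^1}\dfrac{\partial u^1}{\partial x^2} + \dfrac{\partial \phi}{\partial u^2}\dfrac{\partial u^2}{\partial x^2}\bigg) dx^2 \text{ .}$$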

Looking more closely at $\dfrac{\partial u^j}{\partial x^k}$, the product rule gives us that

$$ \dfrac{\partial u^j}{\partial x^k} = \dfrac{\partial (\sum_i A^j_{\:i} \: x^i)}{\partial x^k} = \sum_i\bigg(\dfrac{\partial A^j_{\:i} \:}{\partial x^k} x^i + A^j_{\: i}\dfrac{\partial x^i}{\partial x^k}\bigg)$$ where $\dfrac{\partial x^i}{\partial x^k}$ equals $1$ if $i=k$ and $0$ otherwise, so only the $i=k$ term of the second sum survives and

$$\dfrac{\partial u^j}{\partial x^k} = \sum_i\dfrac{\partial A^j_{\:i} \:}{\partial x^k} x^i + A^j_{\: k} \text{ .}$$

We can now substitute this into $df$ so

$$ df = \sum_k \sum_j \dfrac{\partial \phi}{\partial u^j}\bigg(\sum_i\dfrac{\partial A^j_{\:i} \:}{\partial x^k} x^i + A^j_{\: k}\bigg) dx^k \text{ .} $$

If you are using a Cartesian coordinate system, then the components of $df$ are the components of the gradient. In other words, the gradient can be indexed as

$$ (\nabla f)_k = \sum_j \dfrac{\partial \phi}{\partial u^j}\bigg(\sum_i\dfrac{\partial A^j_{\:i} \:}{\partial x^k} x^i + A^j_{\: k}\bigg) \text{ .}$$

In your problem $A$ is a constant matrix, so $\dfrac{\partial A^j_{\:i}}{\partial x^k}=0$ and this reduces to $(\nabla f)_k = \sum_j A^j_{\: k}\dfrac{\partial \phi}{\partial u^j}$, that is, $\nabla f = A^{T}\nabla\phi(Ax)$ and $df = \langle A^{T}\nabla\phi(Ax), dx\rangle$.
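A quick numerical sanity check, offered as a sketch rather than as part of the derivation: assuming $A$ does not depend on $x$, the $\dfrac{\partial A^j_{\:i}}{\partial x^k}$ term is zero and the formula gives $(\nabla f)_k=\sum_j A^j_{\: k}\dfrac{\partial \phi}{\partial u^j}$. The Python below compares that expression against central finite differences. The test function `phi`, the matrix `A`, the point `x`, and all helper names are made up for illustration.

```python
import math

def phi(u):
    # Hypothetical smooth test function phi: R^n -> R (an assumption, not from the question)
    return sum(math.sin(uj) + 0.5 * uj * uj for uj in u)

def grad_phi(u):
    # Partial derivatives d(phi)/d(u^j) of the test function above
    return [math.cos(uj) + uj for uj in u]

def matvec(A, x):
    # u^j = sum_i A[j][i] * x[i]
    return [sum(A[j][i] * x[i] for i in range(len(x))) for j in range(len(A))]

def grad_f(A, x):
    # Index formula for constant A: (grad f)_k = sum_j A[j][k] * d(phi)/d(u^j)
    g = grad_phi(matvec(A, x))
    return [sum(A[j][k] * g[j] for j in range(len(A))) for k in range(len(x))]

def grad_f_numeric(A, x, h=1e-6):
    # Central finite differences of f(x) = phi(A x), one coordinate at a time
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_plus[k] += h
        x_minus = list(x)
        x_minus[k] -= h
        grad.append((phi(matvec(A, x_plus)) - phi(matvec(A, x_minus))) / (2 * h))
    return grad

# Arbitrary example data with n = 3 rows and m = 4 columns
A = [[1.0, 2.0, -1.0, 0.5],
     [0.0, -1.5, 2.0, 1.0],
     [3.0, 0.25, 0.0, -2.0]]
x = [0.3, -0.7, 1.1, 0.2]

analytic = grad_f(A, x)
numeric = grad_f_numeric(A, x)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print("largest difference between analytic and numeric gradient:", max_err)
```

If the two gradients agree up to finite-difference error, that supports the index bookkeeping: the sum runs over the row index $j$ of $A$, which is the same as multiplying by $A^{T}$.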