Single variable justification for the multivariate chain rule.


I $\def\d{\mathrm d}\def\p{\partial}$am going to ask everyone to switch their paradigms to that of the real line. I am looking for a "lowbrow" explanation of the following phenomenon.

I am talking about the popular method of differentiating functions like $x^x$: first apply the ordinary power rule, then the exponential rule, and then add the two results together.

$$(x^x)' = x\cdot x^{x-1} + x^x\log(x) = x^x(1+\log(x))$$
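As a quick sanity check (not part of the original post), the formula can be confirmed symbolically, assuming sympy is available:

```python
# Verify (x^x)' = x^x (1 + log x) symbolically with sympy.
import sympy as sp

x = sp.symbols('x', positive=True)
lhs = sp.diff(x**x, x)              # sympy's own derivative of x^x
rhs = x**x * (1 + sp.log(x))        # the formula from the post
assert sp.simplify(lhs - rhs) == 0  # the two expressions agree
```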

Although poorly justified, both the method and the result are absolutely correct. By the multivariable chain rule, if $y$ depends on $x$ through functions $f$ and $g$, then

$$\frac{\d y}{\d x} = \frac{\p y}{\p f}\! \frac{\d f}{\d x} + \frac{\p y}{\p g}\! \frac{\d g}{\d x}$$

In our instance, we had $y = f(x)^{g(x)}$ where $f(x) = x$ and $g(x) = x$.

This phenomenon provides a simple justification for other rules of differentiation, such as the product rule. Granting this rule, even the generalized product rule is immediately obvious.

I am asking for a way to justify the following statement without explicitly using multivariate calculus beyond partial derivatives.

Theorem: If $f(x)$ can be written explicitly with $n$ instances of $x$, then one may label the $x$'s from $x_1$ to $x_n$, compute $$f'(x) = \sum_{k=1}^{n} \frac{\p f}{\p x_k},$$ and then set each $x_j = x$.
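The theorem's recipe can be carried out mechanically in a computer algebra system; here is a minimal sketch on $f(x) = x^x$, assuming sympy is available (the labels `x1`, `x2` are ours, not from the post):

```python
# Label the two instances of x in x^x as x1 and x2, sum the partials,
# then substitute x_k = x everywhere, per the theorem.
import sympy as sp

x, x1, x2 = sp.symbols('x x1 x2', positive=True)
F = x1**x2                                   # f with its instances labeled

deriv = sum(sp.diff(F, v) for v in (x1, x2)) # sum of partial derivatives
deriv = deriv.subs({x1: x, x2: x})           # set each labeled variable to x

# Combine powers (x * x**(x-1) -> x**x) and compare with the known answer.
deriv = sp.powsimp(deriv, force=True)
assert sp.simplify(deriv - x**x * (1 + sp.log(x))) == 0
```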

The proof of this need not be formal but I am not looking for anything that treats differentials as fractions. Does anyone have any ideas?

Accepted answer:

Denote $f$ with "$n$ instances labeled" as $F(x_1, \ldots, x_n)$. Hence we have $F(x,\ldots, x) = f(x)$. Then \begin{align*} f(x+h) - f(x) &= F(x+h, \ldots, x+h) - F(x,\ldots, x)\\ &= \sum_{k=1}^n F(\underbrace{x+h, \ldots, x+h}_{k}, x,\ldots, x) - F(\underbrace{x+h, \ldots, x+h}_{k-1}, x,\ldots, x) \end{align*} Hence $$ \frac{f(x+h) - f(x)}h = \sum_{k=1}^n \frac{ F(\overbrace{x+h, \ldots, x+h}^{k}, x,\ldots, x) - F(\overbrace{x+h, \ldots, x+h}^{k-1}, x,\ldots, x)}h $$ If $F$ is continuously differentiable, each summand is a difference quotient in the $k$-th argument alone, evaluated at a point tending to $(x, \ldots, x)$; letting $h \to 0$ and using the continuity of the partials gives $f'(x) = \sum_{k=1}^n \frac{\partial F}{\partial x_k}(x, \ldots, x)$, as claimed.
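The telescoping decomposition can be seen concretely on a small example; here is a plain-float sketch (our own choice of $F$, not from the answer) with $F(x_1, x_2, x_3) = x_1 x_2 x_3$, so $f(x) = x^3$:

```python
# Telescoping argument for F(x1, x2, x3) = x1*x2*x3, i.e. f(x) = x**3.
def F(a, b, c):
    return a * b * c

x, h = 2.0, 1e-6

# Replace the arguments with x+h one at a time; interior terms cancel.
terms = [
    F(x+h, x,   x  ) - F(x,   x,   x),
    F(x+h, x+h, x  ) - F(x+h, x,   x),
    F(x+h, x+h, x+h) - F(x+h, x+h, x),
]

# The sum telescopes to f(x+h) - f(x) ...
assert abs(sum(terms) - (F(x+h, x+h, x+h) - F(x, x, x))) < 1e-12
# ... and dividing by h approximates f'(x) = 3x^2 = 12 at x = 2.
assert abs(sum(terms) / h - 3 * x**2) < 1e-3
```

Each bracketed term, divided by $h$, is a one-variable difference quotient approximating one partial derivative, which is exactly the structure of the proof above.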

Another answer:

In other words, you are expressing $f(x)$ as

$$ f(x) = g(x, x, \ldots, x) $$

and then differentiating both sides with respect to $x$ gives, using the multivariable chain rule,

$$ f'(x) = \sum_{i=1}^n g_i(x, x, \ldots, x) $$

as desired, where $g_i$ means the derivative of $g$ in its $i$-th argument. (I greatly prefer this notation to the usual partial-derivative notation.)

(is this what you're looking for, or were you instead looking for a justification of the multivariable chain rule?)


As an aside, if you allow the use of differentials, then your original description can be given more literally:

$$ \mathrm{d} g(x_1, x_2, \ldots, x_n) = \sum_i g_i(x_1, x_2, \ldots, x_n) \,\mathrm{d} x_i $$

and then, because differentials respect equations among the variables, imposing the equations $x_i = x$ (and thus $\mathrm{d}x_i = \mathrm{d}x$) gives the differentiation rule you're looking for, using the facts that $\mathrm{d}f(x) = f'(x)\, \mathrm{d}x$ and that $g(x)\, \mathrm{d}x = 0$ implies $g(x) = 0$.
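For instance, applying this differential computation to the opening example $f(x) = x^x$, with the two instances labeled $x_1$ and $x_2$:

$$ \mathrm{d}\!\left(x_1^{x_2}\right) = x_2\, x_1^{x_2 - 1}\, \mathrm{d}x_1 + x_1^{x_2} \log(x_1)\, \mathrm{d}x_2 $$

Imposing $x_1 = x_2 = x$ and $\mathrm{d}x_1 = \mathrm{d}x_2 = \mathrm{d}x$ collapses this to $\mathrm{d}(x^x) = x^x(1 + \log(x))\, \mathrm{d}x$, recovering the formula at the top of the question.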