Decomposing a function's variable and adding up the partials of the parts equals the original partial?


Theorem

Let $f(x)$ and $g(x_1, x_2, \ldots, x_n)$ be differentiable and equal when $x_1 = x_2 = \ldots = x_n = x$. Then

$$\frac{\partial f}{\partial x} = \frac{\partial g}{\partial x_1} + \frac{\partial g}{\partial x_2} + \ldots + \frac{\partial g}{\partial x_n}$$

when $x_1 = x_2 = \ldots = x_n = x$.


Example

\begin{align*} f(x) &= x^3 + x^2 + x \\ g(x_1, x_2, x_3) &= x_1 x_2 x_3 + x_1 x_2 + x_1 \end{align*}

Now the sum of the partials is shown to equal the derivative of the original polynomial when all the $x_i$ are equal to $x$.

\begin{align*} \frac{\partial f}{\partial x} &= \frac{\partial g}{\partial x_1} + \frac{\partial g}{\partial x_2} + \frac{\partial g}{\partial x_3} \\ 3x^2 + 2x + 1 &= (x_2 x_3 + x_2 + 1) + (x_1 x_3 + x_1) + (x_1 x_2) \\ &= (x_1 x_2 + x_1 x_3 + x_2 x_3) + (x_1 + x_2) + 1\\ &= 3x^2 + 2x + 1 \tag*{$x_i = x$} \end{align*}
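The example can also be checked numerically. The sketch below (not part of the original post; function names are illustrative) compares a central finite difference of $f$ against the sum of finite-difference partials of $g$ evaluated at $x_1 = x_2 = x_3 = x$.

```python
# Numerical sanity check of the example above (a sketch, not a proof).
def f(x):
    return x**3 + x**2 + x

def g(x1, x2, x3):
    return x1*x2*x3 + x1*x2 + x1

def partial(fn, args, i, h=1e-6):
    """Central finite-difference approximation of d(fn)/d(args[i])."""
    a = list(args)
    a[i] += h
    hi = fn(*a)
    a[i] -= 2*h
    lo = fn(*a)
    return (hi - lo) / (2*h)

x = 1.7
df = partial(f, (x,), 0)                              # f'(x) = 3x^2 + 2x + 1
dg = sum(partial(g, (x, x, x), i) for i in range(3))  # sum of partials at x_i = x
print(abs(df - dg) < 1e-4)  # the two agree to numerical precision
```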


Use Case

The Backpropagation Through Time algorithm used for RNNs seems to assume this when it calculates the partial of the error function $E$ with respect to a certain weight matrix by adding the partials with respect to the matrix at each time step.

$$\frac{\partial E}{\partial W_{hh}} = \frac{\partial E}{\partial W_{hh_t}} + \frac{\partial E}{\partial W_{hh_{t-1}}} + \ldots +\frac{\partial E}{\partial W_{hh_{t-s}}}$$

Here $W_{hh}$ is the weight matrix between the hidden layers of two timesteps, $t$ is the latest timestep, and $s$ is the number of timesteps backwards at which the backpropagation is truncated.
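A minimal scalar sketch of this gradient sum (illustrative only; real RNNs use weight matrices and nonlinearities, and the names here are hypothetical): treat the copy of the weight $w$ used at each time step as a separate variable, accumulate the partial with respect to each copy while backpropagating, and compare the total against a finite-difference derivative.

```python
# Scalar "RNN": h_t = w * h_{t-1} + x_t, with loss E = h_T.
def forward(w, xs, h0=0.0):
    hs = [h0]
    for x in xs:
        hs.append(w * hs[-1] + x)   # h_t = w * h_{t-1} + x_t
    return hs

def bptt_grad(w, xs):
    """dE/dw for E = h_T, summed over the per-timestep partials."""
    hs = forward(w, xs)
    grad, upstream = 0.0, 1.0       # dE/dh_T = 1
    for t in range(len(xs), 0, -1):
        grad += upstream * hs[t-1]  # partial w.r.t. the copy of w used at step t
        upstream *= w               # dh_t/dh_{t-1} = w
    return grad

w, xs = 0.9, [1.0, -0.5, 2.0]
analytic = bptt_grad(w, xs)
h = 1e-6
numeric = (forward(w + h, xs)[-1] - forward(w - h, xs)[-1]) / (2*h)
print(abs(analytic - numeric) < 1e-4)  # the summed partials match dE/dw
```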


Question

What is this property of partials called? And where might I find a proof of it?

Or alternatively, how might it be proven?

Answer

Define $\mathbf{X}(x)=(x,\ldots,x)$. Then $f(x) = g(\mathbf{X}(x))$, so take the $x$-derivative of both sides, making sure to use the multivariable chain rule, and you'll obtain your theorem!
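Spelled out, this is just the multivariable chain rule applied along the diagonal: each component of $\mathbf{X}(x)$ has derivative $1$ with respect to $x$, so

$$\frac{df}{dx} = \frac{d}{dx}\, g(\mathbf{X}(x)) = \sum_{i=1}^{n} \left.\frac{\partial g}{\partial x_i}\right|_{\mathbf{X}(x)} \cdot \frac{dx_i}{dx} = \sum_{i=1}^{n} \left.\frac{\partial g}{\partial x_i}\right|_{x_1 = \cdots = x_n = x},$$

which is exactly the identity stated in the theorem.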