I $\def\d{\mathrm d}\def\p{\partial}$am going to ask everyone to switch their paradigms to that of the real line. I am looking for a "lowbrow" explanation of the following phenomenon.
I am talking about the popular method of differentiating functions like $x^x$: first differentiate using the ordinary power rule (treating the exponent as constant), then using the exponential rule (treating the base as constant), and add the two results.
$$(x^x)' = x\cdot x^{x-1} + x^x\log(x) = x^x(1+\log(x))$$
Although poorly justified, both the method and the result are absolutely correct. By the multivariable chain rule, if $y$ depends on $x$ only through the functions $f$ and $g$, then
$$\frac{\d y}{\d x} = \frac{\p y}{\p f}\! \frac{\d f}{\d x} + \frac{\p y}{\p g}\! \frac{\d g}{\d x}$$
In our instance, we had $y = f(x)^{g(x)}$ where $f(x) = x$ and $g(x) = x$.
This phenomenon provides a simple justification for other rules of differentiation, such as the product rule. Assuming this rule, even the generalized product rule is immediately obvious.
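For instance, the product rule falls out in one line: take $y = f(x)\,g(x)$ and treat $f$ and $g$ as the two "instances",
$$\frac{\d y}{\d x} = \frac{\p y}{\p f}\frac{\d f}{\d x} + \frac{\p y}{\p g}\frac{\d g}{\d x} = g\,\frac{\d f}{\d x} + f\,\frac{\d g}{\d x},$$
which is exactly $(fg)' = f'g + fg'$.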
I am asking for a way to justify the following statement without explicitly using multivariate calculus beyond partial derivatives.
Theorem: If $f(x)$ can be written explicitly with $n$ instances of $x$, then one may label the $x$'s from $x_1\!$ to $x_n$ and compute $f'(x)$ as follows: $$f'(x) = \sum_{k=1}^{n} \frac{\p f}{\p x_k}$$ where, after computing each partial derivative, one sets every $x_j = x$.
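To make the notation concrete, here is the opening example again in this form. For $f(x) = x^x$, labeling the base and the exponent gives $F(x_1, x_2) = x_1^{x_2}$, so
$$\frac{\p F}{\p x_1} = x_2\, x_1^{x_2 - 1}, \qquad \frac{\p F}{\p x_2} = x_1^{x_2}\log(x_1),$$
and setting $x_1 = x_2 = x$ and summing recovers $x\cdot x^{x-1} + x^x\log(x)$ as before.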
The proof need not be formal, but I am not looking for anything that treats differentials as fractions. Does anyone have any ideas?
Denote $f$ with its "$n$ instances labeled" by $F(x_1, \ldots, x_n)$, so that $F(x,\ldots, x) = f(x)$. Then, telescoping, \begin{align*} f(x+h) - f(x) &= F(x+h, \ldots, x+h) - F(x,\ldots, x)\\ &= \sum_{k=1}^n F(\underbrace{x+h, \ldots, x+h}_{k}, x,\ldots, x) - F(\underbrace{x+h, \ldots, x+h}_{k-1}, x,\ldots, x) \end{align*} Hence $$ \frac{f(x+h) - f(x)}h = \sum_{k=1}^n \frac{ F(\overbrace{x+h, \ldots, x+h}^{k}, x,\ldots, x) - F(\overbrace{x+h, \ldots, x+h}^{k-1}, x,\ldots, x)}h $$ The two arguments in the $k$-th summand differ only in the $k$-th slot, so if $F$ is continuously differentiable, each quotient converges to $\frac{\p F}{\p x_k}(x, \ldots, x)$ as $h \to 0$: by the mean value theorem in the $k$-th variable, the quotient equals $\frac{\p F}{\p x_k}$ at an intermediate point, and continuity of the partials finishes the job.
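For anyone who wants to see the theorem in action numerically, here is a minimal sketch for $f(x) = x^x$. The names `F`, `sum_of_partials`, and `derivative` are mine, and all derivatives are approximated by central finite differences rather than computed exactly:

```python
import math

def f(x):
    # f(x) = x^x, written with two "instances" of x
    return x ** x

def F(x1, x2):
    # F(x1, x2) = x1^x2: the labeled version, so F(x, x) = f(x)
    return x1 ** x2

def sum_of_partials(x, h=1e-6):
    # Approximate each partial derivative of F by a central difference,
    # evaluate at x1 = x2 = x, and sum, as the theorem prescribes.
    dF_dx1 = (F(x + h, x) - F(x - h, x)) / (2 * h)
    dF_dx2 = (F(x, x + h) - F(x, x - h)) / (2 * h)
    return dF_dx1 + dF_dx2

def derivative(x, h=1e-6):
    # Direct central-difference derivative of f, for comparison.
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.5
closed_form = x ** x * (1 + math.log(x))  # x^x (1 + log x)
print(sum_of_partials(x), derivative(x), closed_form)
```

All three printed values agree to several decimal places, which is exactly the content of the telescoping argument above.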