Why can partial derivatives be calculated by keeping other variables constant?


I understand that the partial derivative of a function $f(x,y)$ with respect to $x$ is the directional derivative $f'((x,y);\vec i)$. We can calculate it by letting $g(t) = f((x,y)+t\vec i)$, finding $g'(t)$, and then setting $t=0$. Why can we also calculate this partial derivative by treating $y$ as a constant? I know it works, but I am confused as to why.


There are 2 best solutions below

On BEST ANSWER

Suppose we want to evaluate $f'((x_0,y_0);\vec \imath),$ where $(x_0,y_0)$ is an arbitrary point. While we are evaluating this derivative, the values of $x_0$ and $y_0$ do not change, so they are effectively constants during that evaluation.

So now to find $f'((x_0,y_0);\vec \imath),$ your method says to evaluate $g'(t)$ at $t = 0,$ where $g(t) = f((x_0,y_0)+t\vec \imath).$ But note that $(x_0,y_0)+t\vec \imath = (x_0 + t, y_0).$ So we can just as well write $g(t) = f(x_0 + t,y_0).$

But let's take this one step further. Let's make a change of variables: introduce the variable $x$ where $x = x_0 + t.$ Then we can define a function $h$ such that $h(x) = f(x,y_0) = f(x_0 + t,y_0) = g(t).$

Now we can use a little single-variable calculus. In particular, the chain rule with $p(t) = x_0 + t,$ so that $p'(t) = 1,$ tells us that $$ \frac{d}{dt}h(x) = \frac{d}{dt}h(p(t)) = h'(p(t)) \, p'(t) = h'(p(t)) = h'(x), $$

and therefore $$g'(t) = \frac{d}{dt}g(t) = \frac{d}{dt}h(x) = h'(x) .$$

So to get $g'(t)$ you can evaluate $h'(x).$ In particular, $t = 0$ corresponds to $x = x_0,$ so $g'(0) = h'(x_0).$ But $h'(x) = \frac{d}{dx}f(x,y_0),$ that is, it is what you get if you hold the second argument of $f$ constant, which effectively gives you a single-variable function of the first argument, and differentiate with respect to that argument.
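A quick numerical sanity check of this equivalence might look like the sketch below. The function `f`, the point `(x0, y0)`, and the helper `derivative` are all hypothetical choices made only for illustration; the check approximates $g'(0)$ and $h'(x_0)$ by central differences and compares both with the closed form found by treating $y$ as a constant.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y + math.sin(x * y)

def derivative(func, a, step=1e-6):
    # Central-difference approximation of func'(a) (an approximation, not exact).
    return (func(a + step) - func(a - step)) / (2 * step)

# Arbitrary evaluation point (an assumption for this sketch).
x0, y0 = 1.5, 0.7

def g(t):
    # Directional route: g(t) = f((x0, y0) + t*i) = f(x0 + t, y0).
    return f(x0 + t, y0)

def h(x):
    # "Hold y constant" route: h(x) = f(x, y0).
    return f(x, y0)

via_direction = derivative(g, 0.0)   # approximates g'(0)
via_constant_y = derivative(h, x0)   # approximates h'(x0)

# Closed form from differentiating in x with y treated as a constant:
# d/dx [x^2 * y + sin(x * y)] = 2*x*y + y*cos(x*y)
exact = 2 * x0 * y0 + y0 * math.cos(x0 * y0)

print(via_direction, via_constant_y, exact)
```

The three printed values should agree up to the error of the finite-difference approximation, which is what the chain-rule argument predicts.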


I think in fact the idea of differentiating as a function of one of the arguments while holding the other arguments constant is the older, more established way of defining partial derivatives, and the definition by means of a directional derivative is a relatively new idea. So perhaps we should be trying to prove that "we can also calculate" the old-style partial derivative using the definition you learned! But I think you can easily enough reverse the steps I took above.

4
On

If $g(x,y)$ denotes the partial derivative of $f(x,y)$ with respect to $x$, then by definition $$ f(x+\epsilon,y)=f(x,y)+\epsilon \cdot g(x,y) + o(\epsilon) \quad \text{as } \epsilon \to 0. $$ Note that $y$ is not changed at all in this equation, only $x$. Thus, we can ignore $y$ (more precisely, hold it constant) when taking the partial derivative with respect to $x$.

In other words, if we fix a constant $c$ and substitute $y=c$ into $f(x,y)$ to obtain a new single-variable function $h(x)=f(x,c)$, then $h'(x)=g(x,c)$.
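As a small worked example (the function here is a hypothetical choice, not one from the question), take $f(x,y) = x^2 y$. Expanding directly gives $$ f(x+\epsilon,y) = (x+\epsilon)^2 y = x^2 y + \epsilon \cdot 2xy + \epsilon^2 y, $$ and the last term $\epsilon^2 y$ is $o(\epsilon)$ as $\epsilon \to 0$, so $g(x,y) = 2xy$. On the other hand, fixing $y = c$ gives $h(x) = c\,x^2$ and hence $h'(x) = 2cx = g(x,c)$, which is the same result obtained by treating $y$ as a constant from the start.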