Can't make sense of taking partial derivatives of functions of related variables in exact differential equations


From Khan Academy:
consider the following first-order differential equation: $$ y\cos(x)+2xe^y+(\sin(x)+x^2e^y-1)y' = 0 $$ The instructor has given the following solution to this differential equation, stating it is an exact one: $$ y\sin(x)+x^2e^y-y=C $$ Note that it follows from the latter and from the statement of the problem that $y$ is a function of $x$. For any chosen constant $C$, there must be a unique $x$ corresponding to each value of $y$, and the other way round.
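As a sanity check (my addition, not from the video): differentiating the claimed implicit solution with $y$ treated as $y(x)$ should reproduce the left-hand side of the ODE. A minimal sketch with sympy, assuming it is installed:

```python
import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')(x)

# Candidate implicit solution F(x, y) = C from the video
F = y*sp.sin(x) + x**2*sp.exp(y) - y

# d/dx of F, with y depending on x, should equal the ODE's left-hand side
lhs = sp.diff(F, x)
ode_lhs = y*sp.cos(x) + 2*x*sp.exp(y) + (sp.sin(x) + x**2*sp.exp(y) - 1)*sp.Derivative(y, x)
print(sp.simplify(lhs - ode_lhs))  # 0
```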

Now, at some point he states that we check the equation for being an exact one by taking the partial derivative of the first sum with respect to $x$: $$ \frac{\partial}{\partial x} (y\cos(x)+2xe^y) $$ He further states that we treat $y$ as a constant for this purpose.

How can this ever make sense? $y$ is deterministic and depends on $x$, and it is going to change when we change $x$. Holding it constant seems as illegitimate as dividing by zero. Where is my misunderstanding?


2 Answers

BEST ANSWER

Sorry for the mega late update on this. I'd like to point out that Allawonder's answer contains many useful insights, but I still would like to expand a bit. Just to make myself clear, I'd like to first revisit the source of my confusion w.r.t. the problem at hand.

Namely, I didn't understand: 1) in the course of solving the differential equation at hand, we are looking for a function $y = f(x)$, whereby the pairs $(x,y)$ form an infinite set satisfying this function/equation. What I want to say is that $x$ and $y$ are related: $y$ depends on $x$ and vice versa. 2) Then, at some point, for reasons that @Allawonder has described nicely, we differentiate $f(x,y)$ w.r.t. $x$, holding $y$ constant. What I couldn't realize is how we can possibly hold $y$ constant if it is related to $x$ by the very fact that we are looking for a function $y=f(x)$ where both variables are clearly related.

I think what @Allawonder didn't mention is the concept of implicit differentiation, which I guess is the key to understanding here.

Let a function $$z = f(x,y) = -(x^2+y^2)$$ be given. This would look something like this: ClickToSee (sorry, I am not yet allowed to embed pictures)

For such a function, it makes sense to differentiate w.r.t. $x$ holding $y$ constant, and vice versa. I.e., very simply, we are looking at how $f(x,y)$ changes if we change only $x$ (or only $y$). But we are not quite there yet.

Let us now take a look at a level set of this function. In other words, let $$f(x,y) = -(x^2+y^2) = k = -3$$ (note that $k$ must be negative here, since $-(x^2+y^2)\le 0$). Graphically, it is like cutting our 3-dimensional surface with a horizontal plane and considering all the points at the intersection of the plane and the surface. It would look something like this: CutSurface You see that this set of points, if we look at it from above, represents the familiar circle, which is described in two dimensions by $$f(x,y) = -(x^2+y^2) = k$$

For such a function, we may first notice that, although both $x$ and $y$ appear on the left-hand side, they are related: if $x$ changes, $y$ has to change to keep the function equal to $k$. For such an expression, implicit differentiation means taking the derivative of the LHS w.r.t. one variable with the knowledge that the two variables are now dependent on each other. Interestingly, this is not the end.
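To see this dependence concretely, here is a small numeric sketch (my own illustration; the level $k=-3$ is just an example): as $x$ moves along the level set, $y$ is forced to move with it so that $f(x,y)$ stays at $k$.

```python
import math

# The level set -(x**2 + y**2) = k with k = -3 is the circle x**2 + y**2 = 3
k = -3.0
for x in [0.0, 0.5, 1.0, 1.5]:
    y = math.sqrt(-k - x**2)      # upper branch: y must change when x changes
    print(x, y, -(x**2 + y**2))   # third column stays equal to k
```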

Let us take the derivative of $$-(x^2+y^2) = k$$ w.r.t. $x$: $$\frac{d}{dx}\left(-(x^2+y^2)\right) = -2x - 2y\,\frac{dy}{dx} = \frac{\partial f(x,y)}{\partial x} + \frac{\partial f(x,y)}{\partial y}\cdot\frac{dy}{dx},$$ where the first term $-2x$ is just the derivative of $z=f(x,y)=-(x^2+y^2)$ holding $y$ constant, and $-2y$ is the partial derivative of $z=f(x,y)=-(x^2+y^2)$ holding $x$ constant. Once again in words: the implicit derivative of the one-variable relation $f(x,y) = k$ is the partial derivative of the multi-variate function $z=f(x,y)$ w.r.t. $x$ holding $y$ constant, plus the partial derivative of the same multi-variate function w.r.t. $y$ holding $x$ constant, times the derivative of $y$ w.r.t. $x$. I think that's where we see the link between single-variable and two-variable calculus in this case: the implicit derivative of the single-variable relation is expressed through the partial derivatives of the multi-variate function whose level set the relation describes ($-(x^2+y^2) = k$ is a level set of $z = -(x^2+y^2)$). Probably this is very intuitive for non-newbies, but it was kind of a discovery for me.
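The same computation can be checked mechanically; a small sympy sketch (my addition), once with the formula $dy/dx = -f_x/f_y$ that follows from the identity above, and once with sympy's `idiff` helper for implicit differentiation:

```python
import sympy as sp

x, y = sp.symbols('x y')
k = sp.Symbol('k')

f = -(x**2 + y**2)

# dy/dx from the implicit-differentiation identity: f_x + f_y * dy/dx = 0
dydx = -sp.diff(f, x) / sp.diff(f, y)
print(sp.simplify(dydx))  # -x/y

# sympy's idiff performs the same implicit differentiation on f(x, y) - k = 0
print(sp.idiff(f - k, y, x))  # -x/y
```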

We then use the fact that the implicit derivative contains the term $$\frac{dy}{dx}$$ to solve the differential equation, as Allawonder has nicely described. I.e., when solving such a differential equation, we suppose that its left-hand side happens to be the implicit derivative of some function. Hope this will be helpful for anyone facing the same confusion as I did.
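For completeness, the original equation can also be handed to a CAS; a sketch (my addition, assuming sympy is available) using `dsolve` with its exact-equation hint:

```python
import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')

ode = sp.Eq(y(x)*sp.cos(x) + 2*x*sp.exp(y(x))
            + (sp.sin(x) + x**2*sp.exp(y(x)) - 1)*y(x).diff(x), 0)

# Ask for the exact-equation method explicitly; the result is an implicit solution
sol = sp.dsolve(ode, y(x), hint='1st_exact')
print(sol)
```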


Naturally, this wouldn't make much sense unless you've first studied multivariable calculus. There, in the two variable case for example (which is what's relevant here anyway), you learn that the derivative (as it were) of a function $\phi(x,y)$ is given by a two-dimensional vector. This is usually called the gradient of the function $\phi.$

Now from single-variable calculus, you may recall that an interesting question is to solve the equation $y'=f(x)$ for the unknown $y,$ with $f(x)$ specified. This corresponds to finding primitives of $f(x),$ or in more common parlance, integrating $f(x).$ A similar question in the multi-variable case is to construct a function given its gradient vector. Just as in single-variable calculus it is shown that not every function is the derivative of some function (in short, a function that skips intermediate values cannot be the derivative of any function), we have a similar result that says not every two-dimensional vector is the gradient of some bivariable function. The condition that such vectors must satisfy in order to be considered as gradients is that their cross derivatives must be equal and continuous. That is, given the vector $$\left(f(x,y),g(x,y)\right),$$ then it is the gradient of some $\phi(x,y)$ provided that $\partial_y f=\partial_x g,$ both partials being continuous.

This is the test usually applied to check so-called exact ordinary differential equations of first order; such can be put in the form $$f(x,y)\mathrm d x+g(x,y)\mathrm d y=0,$$ whose left-hand side reminds one of the so-called total differential $\mathrm d \phi$ of a function $\phi(x,y),$ where $(f,g)$ would then be the gradient of some such $\phi.$ The test given is just the one from calculus to see whether in fact this is the total differential of some $\phi.$ Thus we cross-differentiate: we differentiate the first component with respect to the second variable $y$ (not with respect to $x$ again, as you stated above), and the second component with respect to $x.$ This corresponds to checking whether the mixed second-order partials are equal, which is necessary for the expression to represent the differential of some function; the integration can then be carried out as usual.
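Applied to the equation in the question, the cross-derivative test looks like this (a sympy sketch, not part of the original answer):

```python
import sympy as sp

x, y = sp.symbols('x y')

# Components of the vector (f, g) from f dx + g dy = 0 in the question
f = y*sp.cos(x) + 2*x*sp.exp(y)
g = sp.sin(x) + x**2*sp.exp(y) - 1

# Exactness test: the cross partials f_y and g_x must agree
print(sp.diff(f, y))
print(sp.diff(g, x))
print(sp.simplify(sp.diff(f, y) - sp.diff(g, x)))  # 0, so the equation is exact
```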


PS. As for your question about the reasonableness of taking partial derivatives: this, as well as the process of solving exact differential equations, will only become clear after first studying the calculus of more than a single variable, as I've said above. However, if you want to differentiate a function of two independent variables, say $xy,$ how should you go about it? In single-variable calculus, this question is not addressed, as a matter of course. But one might for a trial take derivatives with respect to each of the variables, treating the other as a constant in the meantime, and this we already know how to do, since then the function varies with just a single variable. Thus if you differentiate $xy$ with regard to $x$ alone, treating $y$ as constant, you have $y.$ This is called the (first) partial derivative of $xy$ with respect to $x.$ The other partial (relative to $y$) is $x.$ This is a start for the investigation of how one may actually define a concept for functions of several variables analogous to the notion of derivative in single-variable calculus.
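The two partials of $xy$ can each be checked in one line; a sympy sketch (my addition):

```python
import sympy as sp

x, y = sp.symbols('x y')

# Partial derivatives of x*y: differentiate w.r.t. one variable,
# treating the other as a constant
print(sp.diff(x*y, x))  # y
print(sp.diff(x*y, y))  # x
```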