Where am I going wrong with using the chain rule to compute the derivative of this composite function?


Let $g$ be a function of $y_1$ and $y_2$;

$g = f(y_1, y_2) = y_1 \times y_2$. And further let

$y_1 = x^2$ and

$y_2 = 3\sin(x)$

Therefore $g = 3x^2 \sin(x)$. The derivative with respect to $x$ then becomes

$\frac{dg}{dx} = 6x\sin(x) + 3x^2 \cos(x)$.

Now I would like to compute the same derivative using the chain rule. So

$\frac{dg}{dx} = \frac{1}{2}\frac{dg}{dx} + \frac{1}{2}\frac{dg}{dx} = \frac{1}{2}\frac{dg}{dx}\frac{dy_1}{dy_1} + \frac{1}{2}\frac{dg}{dx}\frac{dy_2}{dy_2} = \frac{1}{2}\frac{dg}{dy_1}\frac{dy_1}{dx} + \frac{1}{2}\frac{dg}{dy_2}\frac{dy_2}{dx} = \frac{1}{2}y_2 \times 2x + \frac{1}{2} y_1 \times 3\cos(x) = 3x\sin(x) + \frac{3}{2}x^2\cos(x) $

Which is off by the factor $\frac{1}{2}$.

The algebra breaks down at some point here. Can anyone see where I am going wrong?


There are 2 best solutions below

2
On BEST ANSWER

The issue arises because you are abusing notation. If you write out all the functions involved, then it is harder to go wrong.


Let $f$ be given by $f(y_1,y_2)=y_1y_2$.

Let $h=(h_1,h_2)$ where $h_1(x)=x^2$ and $h_2(x)=3 \sin(x)$.

You are looking for the derivative of $g=f\circ h $, i.e. the derivative of $g(x)=f(h(x))=f(h_1(x),h_2(x))$

According to the chain rule this is

$$g'(x)=f_1'(h(x))\cdot h_1'(x)+f_2'(h(x))\cdot h_2'(x)\tag{1}$$

where $f_j'$ denotes the partial derivative of $f$ with respect to the $j$th argument.

We have $f_1'(y_1,y_2)=y_2$, $f_2'(y_1,y_2)=y_1$, $h_1'(x)=2x$, and $h_2'(x)=3\cos (x)$. Thus

$$\begin{aligned}g'(x)&=h_2(x)\cdot (2x)+h_1(x)\cdot (3\cos (x))\\ &=(3\sin (x))\cdot(2x)+x^2\cdot (3\cos (x))\\ &=6x\sin(x)+3x^2\cos(x).\end{aligned}$$
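As a rough numerical sanity check of $(1)$, here is a minimal Python sketch. It approximates each factor in the chain rule by a central difference and compares the result with the closed form $6x\sin(x)+3x^2\cos(x)$. The helper names (`central_diff`, `chain_rule_derivative`, `closed_form`), the step size `eps`, and the sample points are illustrative assumptions, not part of the derivation above.

```python
import math

def f(y1, y2):
    # Outer function f(y1, y2) = y1 * y2
    return y1 * y2

def h1(x):
    # Inner function h1(x) = x^2
    return x ** 2

def h2(x):
    # Inner function h2(x) = 3 sin(x)
    return 3 * math.sin(x)

def g(x):
    # Composite g = f o h
    return f(h1(x), h2(x))

def central_diff(func, x, eps=1e-6):
    # Central-difference approximation of func'(x); eps is an assumed step size
    return (func(x + eps) - func(x - eps)) / (2 * eps)

def chain_rule_derivative(x):
    # Right-hand side of (1): f_1'(h(x)) h_1'(x) + f_2'(h(x)) h_2'(x)
    y1, y2 = h1(x), h2(x)
    df_dy1 = central_diff(lambda t: f(t, y2), y1)  # partial w.r.t. first argument
    df_dy2 = central_diff(lambda t: f(y1, t), y2)  # partial w.r.t. second argument
    return df_dy1 * central_diff(h1, x) + df_dy2 * central_diff(h2, x)

def closed_form(x):
    # Derivative obtained directly from g(x) = 3 x^2 sin(x)
    return 6 * x * math.sin(x) + 3 * x ** 2 * math.cos(x)

for x in (0.5, 1.0, 2.0):
    print(x, chain_rule_derivative(x), closed_form(x), central_diff(g, x))
```

At each sample point the three printed values should agree up to the finite-difference error, with no stray factor of $\frac{1}{2}$.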


In alternative notation $(1)$ can be written as:

$$\frac{dg}{dx}(x)=\frac{\partial f}{\partial y_1}(h(x))\cdot \frac{dh_1}{dx}(x)+\frac{\partial f}{\partial y_2}(h(x))\cdot \frac{dh_2}{dx}(x)$$

or, with a slight abuse of notation (using $y_j$ to denote both the function $h_j$ and a variable) and suppressing arguments of functions:

$$\frac{dg}{dx}=\frac{\partial f}{\partial y_1}\cdot \frac{dy_1}{dx}+\frac{\partial f}{\partial y_2}\cdot \frac{dy_2}{dx}$$

Compare this with your equation:

$$\frac{dg}{dx} = \frac{1}{2}\frac{dg}{dy_1}\frac{dy_1}{dx} + \frac{1}{2}\frac{dg}{dy_2}\frac{dy_2}{dx} $$

On the left you have used $dg/dx$ to denote the derivative of the composite function given by $3x^2\sin(x)$. On the right you have abused notation and used $d g/d y_j$ to denote the partial derivative of $y_1y_2$ with respect to $y_j$. This abuse of notation is not so much of an issue as long as you recognize it as such. However, I think it is this abuse of notation that has led you to write:

$$\frac{1}{2}\frac{dg}{dx}\frac{dy_1}{dy_1} + \frac{1}{2}\frac{dg}{dx}\frac{dy_2}{dy_2} = \frac{1}{2}\frac{dg}{dy_1}\frac{dy_1}{dx} + \frac{1}{2}\frac{dg}{dy_2}\frac{dy_2}{dx}.$$

0
On

You start by saying $$\dfrac{\mathrm d}{\mathrm d x}\left[3x^2\sin(x)\right]=6x\sin(x)+3x^2\cos(x),$$

which was most likely obtained by the product rule:

$$\dfrac{\mathrm d}{\mathrm d x}\left[3x^2\sin(x)\right]=\sin(x)~\dfrac{\mathrm d}{\mathrm d x}\left[3x^2\right]+3x^2~\dfrac{\mathrm d}{\mathrm d x}\left[\sin(x)\right]$$

Then you wanted to compute that by the chain rule. Well, that is really just the same thing.

$$\begin{aligned}\dfrac{\mathrm d}{\mathrm d x}\left[y_1 y_2\right]&=\dfrac{\partial}{\partial y_1}\left[y_1y_2\right]~\dfrac{\mathrm d y_1}{\mathrm d x}+\dfrac{\partial}{\partial y_2}\left[y_1y_2\right]~\dfrac{\mathrm d y_2}{\mathrm d x}\\&=y_2~\dfrac{\mathrm d y_1}{\mathrm d x}+y_1~\dfrac{\mathrm d y_2}{\mathrm d x}\end{aligned}$$


However, what you did was try to claim: $$\dfrac{\mathrm d g}{\mathrm d x}\dfrac{\mathrm d y_1}{\mathrm d y_1}=\dfrac{\partial g}{\partial y_1}\dfrac{\mathrm d y_1}{\mathrm d x}$$

This is not valid. Although the Leibniz notation for differentials behaves similarly enough to quotients for it to be a convenient shorthand, the two are truly not the same.
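One way to see the failure concretely is to evaluate both sides of the claimed identity with the functions from the question, that is, $g=y_1y_2$ with $y_1=x^2$ and $y_2=3\sin(x)$, and taking $\frac{\mathrm d y_1}{\mathrm d y_1}=1$:

$$\dfrac{\mathrm d g}{\mathrm d x}\dfrac{\mathrm d y_1}{\mathrm d y_1}=6x\sin(x)+3x^2\cos(x),\qquad \dfrac{\partial g}{\partial y_1}\dfrac{\mathrm d y_1}{\mathrm d x}=y_2\cdot 2x=6x\sin(x).$$

The two sides differ by $3x^2\cos(x)$, which is exactly the contribution of the $y_2$ term in the chain rule. "Cancelling" $\mathrm d y_1$ discards the dependence of $g$ on $x$ through $y_2$, so the left side is the full derivative while the right side is only one of its two summands.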