Using chain rule for derivative of multivariable function


I've been trying to learn more about the chain rule for multivariable functions, but I'm a bit confused about a special case I've encountered. I'm given a single-valued function

$\displaystyle f(x_1,x_2,x_3)$

evaluated at $x_3 = g(x_1,x_2)$, which is also a single-valued function. Note that the $x_n$ are all scalars. I think I would summarize the functions as $f:\mathbb{R}^3\rightarrow \mathbb{R}$ and $g:\mathbb{R}^2\rightarrow \mathbb{R}$. I'm going to represent the function composition as

$\displaystyle h(x_1,x_2) = f(x_1,x_2,g(x_1,x_2)) \equiv (f \circ g)(x_1,x_2)$

(EDIT: The equivalence above is wrong...see answer/comments)

in which $h:\mathbb{R}^2\rightarrow \mathbb{R}$. I'm told to find the derivative of $h$ with respect to $x_2$. However, I want to write down the general expression for the derivative. Following the link above (and material within), I can use the so-called derivative operator $\textbf{D}$ and write

$\displaystyle \textbf{D}h = \textbf{D}f|_{x_3=g}\textbf{D}g$

where I've introduced the notation $|_{x_3=g}$ to denote "is evaluated at $x_3 = g(x_1,x_2)$" and dropped the other function variables for clarity. What I don't follow is how to write this in terms of vector/matrix products. In some of the linked material, it says the derivative of a scalar valued function is expressed as a $1 \times n$ row vector. Therefore,

$\displaystyle \textbf{D}h = \left[ \frac{\partial h}{\partial x_1} \quad \frac{\partial h}{\partial x_2}\right]$

$\displaystyle \textbf{D}f|_{x_3=g} = \left[ \frac{\partial f}{\partial x_1}|_{x_3=g} \quad \frac{\partial f}{\partial x_2}|_{x_3=g} \quad \frac{\partial f}{\partial x_3}|_{x_3=g} \right]$

$\displaystyle \textbf{D}g = \left[ \frac{\partial g}{\partial x_1} \quad \frac{\partial g}{\partial x_2}\right]$

I know the answer, specific to the question: "find the derivative of $h$ with respect to $x_2$" is

$\displaystyle \frac{\partial h}{\partial x_2} = \frac{\partial f}{\partial x_2}|_{x_3=g} + \frac{\partial f}{\partial x_3}|_{x_3=g} \frac{\partial g}{\partial x_2}$

so my question is not about the answer. I'm trying to see how I would arrive at it using vector/matrix products. Specifically, the form "$\displaystyle \textbf{D}h = \textbf{D}f|_{x_3=g}\textbf{D}g$" is easy to remember (from single-variable days), and it would be helpful to understand how to use it in all special cases, with the case presented here as the example.

Best answer

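Define $G(x,y)=(x,y,g(x,y))$, so that $G:\mathbb{R}^2\rightarrow\mathbb{R}^3$ and $h = f \circ G$, i.e. $h(x,y)=f(x,y,g(x,y))$. Then

$DG= \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ g_x & g_y \end{bmatrix}$ and $Df=[f_x, f_y, f_z]$, with $Df$ evaluated at $G(x,y)$. The chain rule gives the matrix product $Dh = Df \, DG = [f_x + f_z g_x , f_y+f_z g_y]$, whose second entry is what you have: $h_y=f_y+f_z g_y$.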
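
As a worked check, here is the row-times-matrix multiplication written out entry by entry, with every partial derivative of $f$ evaluated at $(x,y,g(x,y))$:

$\displaystyle [f_x \quad f_y \quad f_z]\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ g_x & g_y \end{bmatrix} = [\, f_x\cdot 1 + f_y\cdot 0 + f_z\, g_x \quad f_x\cdot 0 + f_y\cdot 1 + f_z\, g_y \,]$

The shapes are $(1\times 3)(3\times 2) = 1\times 2$, which matches the shape of $Dh$.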

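The problem was that you didn't set up $G$ correctly: $g$ maps into $\mathbb{R}$, not $\mathbb{R}^3$, so $f \circ g$ is not defined, and the $1\times 3$ row $Df$ cannot be multiplied by the $1\times 2$ row $Dg$.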
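
If it helps, the identity can also be checked numerically. The sketch below is only an illustration: the particular `f` and `g` are made-up examples (they are not from the question), and the derivatives are approximated by central differences rather than computed exactly. It builds $Df$ at $G(x_1,x_2)$ and the $3\times 2$ matrix $DG$, multiplies them, and compares the result with a direct finite-difference derivative of $h$.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(x1, x2, x3):
    return x1 * x2 + math.sin(x3)

def g(x1, x2):
    return x1 * x2 ** 2

def h(x1, x2):
    # h = f o G, where G(x1, x2) = (x1, x2, g(x1, x2))
    return f(x1, x2, g(x1, x2))

def jacobian_row(func, point, eps=1e-6):
    """Central-difference approximation of the 1 x n derivative of a scalar function."""
    row = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += eps
        down[i] -= eps
        row.append((func(*up) - func(*down)) / (2 * eps))
    return row

def chain_rule_Dh(x1, x2):
    x3 = g(x1, x2)
    Df = jacobian_row(f, (x1, x2, x3))        # 1 x 3, evaluated at G(x1, x2)
    g1, g2 = jacobian_row(g, (x1, x2))        # last row of DG
    DG = [[1.0, 0.0], [0.0, 1.0], [g1, g2]]   # 3 x 2
    # Row vector times matrix: (1 x 3)(3 x 2) = 1 x 2
    return [sum(Df[k] * DG[k][j] for k in range(3)) for j in range(2)]

point = (0.7, -1.3)
via_chain = chain_rule_Dh(*point)   # Df * DG
direct = jacobian_row(h, point)     # Dh computed directly
print("Df * DG:", via_chain)
print("Dh     :", direct)
```

The two printed rows should agree up to finite-difference error, and the second entry of each is the $\partial h/\partial x_2$ from the question.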