I am new to calculus and am trying to work out the following question, with no success so far… Any feedback would be great!
Within function $f(x,y)$, variable $y$ is a function of $(x,z)$, in other words $y=g(x,z)$. Therefore, we have function $φ(x,z)$, defined as $φ(x,z)=f(x,g(x,z))$.
Given are:
$f(1,3)=2$, $f_x(1,3)=4$, $f_y(1,3)=1$
$g(1,2)=3$, $g_x(1,2)=2$, $g_z(1,2)=-2$
Calculate $\left\lVert \nabla φ(1,2) \right\rVert$.
All conditions for differentiability are fulfilled.
Not all data are necessarily required for solving the exercise.
a. $\sqrt{34}$
b. $\sqrt{36}$
c. $\sqrt{38}$
d. $\sqrt{40}$
e. $\sqrt{42}$
I am pretty much lost here. To the best of my understanding all the data about (1,3) are irrelevant. Furthermore, $\nablaφ=(f_x,f_y·y_x+f_y·y_z)$, where $y_x$ and $y_z$ are equal to $g_x$ and $g_z$ respectively. As such, no matter what, the second part of the gradient cancels itself out, leaving me with $\nablaφ=(f_x,0)$. I am not sure if any of this is correct, but either way it does not lead me to the answer. Can anybody please help out?
I find it easier to keep track of things by working with differentials and using a positional notation for partial derivatives: if $f:\mathbb R^n\to\mathbb R$, $\partial_k f$ is the partial derivative of $f$ with respect to the $k$th variable. The differential of $f$ at $\mathbf p$ is denoted by $df_{\mathbf p}$ and the chain rule is $d(f \circ g)_{\mathbf p} = df_{g(\mathbf p)} \circ dg_{\mathbf p}$. If we have $f:\mathbb R^n\to\mathbb R^m$ and fix coordinate systems for the domain and codomain, the linear map $df_{\mathbf p}$ is represented by the familiar Jacobian matrix of $f$, and the chain rule becomes matrix multiplication. For a scalar-valued $f$ (that is, $m=1$), $\nabla f(\mathbf p)$ is the transpose of this matrix.
In this case, we have $f,g:\mathbb R^2\to\mathbb R$. It will also be useful to introduce an intermediate function $$h:\pmatrix{x\\z}\mapsto\pmatrix{x\\g(x,z)}$$ so that $\varphi = f \circ h$. We then have
$$df_{\mathbf p} = \pmatrix{\partial_1 f(\mathbf p) & \partial_2 f(\mathbf p)} \\ dg_{\mathbf p} = \pmatrix{\partial_1 g(\mathbf p) & \partial_2 g(\mathbf p)} \\ dh_\mathbf p = \pmatrix{1&0 \\ \partial_1 g(\mathbf p) & \partial_2 g(\mathbf p)}$$ so that (with some small abuses of notation)
$$\begin{align} d\varphi_{(1,2)} &= df_{h(1,2)} \circ dh_{(1,2)} \\ &= \pmatrix{\partial_1 f(1,3) & \partial_2 f(1,3)} \pmatrix{1&0 \\ \partial_1 g(1,2) & \partial_2 g(1,2)} \\ &= \pmatrix{\partial_1 f(1,3)+\partial_2f(1,3) \partial_1 g(1,2) & \partial_2f(1,3) \partial_2 g(1,2)}. \end{align}$$ Switching back to notation that’s more familiar to you, this says that $$\varphi_x(1,2) = f_x(1,3) + f_y(1,3) g_x(1,2) = 4 + 1\cdot2 = 6 \\ \varphi_z(1,2) = f_y(1,3) g_z(1,2) = 1\cdot(-2) = -2,$$ therefore $\lVert\nabla\varphi(1,2)\rVert = \sqrt{6^2+(-2)^2} = \sqrt{40}$, which is option (d). The values $f(1,3)=2$ and $g(1,2)=3$ are not used in the arithmetic itself; $g(1,2)=3$ only tells us that $h(1,2)=(1,3)$, i.e. at which point the partial derivatives of $f$ must be taken.
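If it helps to see the matrix product carried out mechanically, here is a small Python sketch. It only uses the numbers given in the exercise; the helper `matmul` and the variable names are my own choices for illustration, not part of the problem.

```python
import math

# Partial derivatives given in the exercise.
f_x, f_y = 4.0, 1.0    # partials of f at h(1,2) = (1,3)
g_x, g_z = 2.0, -2.0   # partials of g at (1,2)

df = [[f_x, f_y]]      # 1x2 matrix of df at (1,3)
dh = [[1.0, 0.0],
      [g_x, g_z]]      # 2x2 matrix of dh at (1,2)

def matmul(a, b):
    """Plain matrix product of two lists of rows (illustrative helper)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Chain rule: d(phi) at (1,2) is df at h(1,2) times dh at (1,2).
dphi = matmul(df, dh)

# The gradient is the transpose of this 1x2 matrix; its norm is the answer.
grad_norm = math.hypot(*dphi[0])

print(dphi[0])
print(math.isclose(grad_norm, math.sqrt(40)))
```

The single row of `dphi` holds $\varphi_x(1,2)$ and $\varphi_z(1,2)$, and the final comparison checks the norm against $\sqrt{40}$.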
Notice that $df$ is taken at $h(1,2)=(1,3)$. This is because we need the differential at the point at which $f$ is evaluated. To evaluate $(f\circ h)(1,2)$, we first apply $h$ to get $h(1,2)=(1,3)$, then evaluate $f$ at this point to get $f(h(1,2))$. It’s the same thing with the differentials. To compute the linear approximation to the change in $f\circ h$ at $(1,2)$, we follow the maps along by first applying $dh_{(1,2)}$ to the displacement and then applying $df_{h(1,2)}$ to the result.
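To see numerically why the point of evaluation matters, here is a rough sketch with made-up functions. The particular `f` and `g` below are assumptions of mine, chosen only so that they reproduce the given data ($f(1,3)=2$, $f_x(1,3)=4$, $f_y(1,3)=1$, $g(1,2)=3$, $g_x(1,2)=2$, $g_z(1,2)=-2$); the exercise does not specify them. Partial derivatives are approximated by central differences.

```python
# Hypothetical functions matching the data of the exercise (assumed, not given).
def g(x, z):
    return 2 * x - 2 * z + 5

def f(x, y):
    return 4 * x + y ** 2 / 6 - 3.5

def phi(x, z):
    return f(x, g(x, z))

def partial(func, point, k, eps=1e-6):
    """Central-difference estimate of the k-th partial derivative at point."""
    hi, lo = list(point), list(point)
    hi[k] += eps
    lo[k] -= eps
    return (func(*hi) - func(*lo)) / (2 * eps)

p = (1.0, 2.0)        # the point where we want the gradient of phi
q = (p[0], g(*p))     # h(p) = (1, 3), the point where f is evaluated

# Direct numerical gradient of phi at p, used as the reference.
grad_fd = [partial(phi, p, 0), partial(phi, p, 1)]

def chain(f_point):
    """Chain-rule gradient of phi at p, with f's partials taken at f_point."""
    fx, fy = partial(f, f_point, 0), partial(f, f_point, 1)
    gx, gz = partial(g, p, 0), partial(g, p, 1)
    return [fx + fy * gx, fy * gz]

right = chain(q)  # partials of f taken at h(p) = (1, 3)
wrong = chain(p)  # partials of f taken at p = (1, 2), a common slip

print(grad_fd, right, wrong)
```

With these assumed functions, `right` agrees with the direct estimate `grad_fd`, while `wrong` does not, because $f_y(1,2)\neq f_y(1,3)$ for this particular `f`.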