I understand the steps to calculate the total derivative of $f(x, g(x))$.
Related: Derivative of $f(x, g(x))$ with respect to $x$
I have three sub-questions about calculating the total derivative of $f(x, g(x, y))$.
(1) How do I calculate its total derivative? Here is my attempt:
$$ df=\Big(\frac{\partial{f}}{\partial x}+\frac{\partial{g}}{\partial x}\Big)dx+\frac{\partial{f}}{\partial y}dy $$
Applying this to the simple example $f(x, x+y)$, where $g(x,y)=x+y$: $$ df=\Big(\frac{\partial{f}}{\partial x}+1\Big)dx+\frac{\partial{f}}{\partial y}dy $$
(2) Why do I not need to consider higher-order terms? Would including them, as in a Taylor series, make the result more accurate?
(3) In terms of approximating the total derivative, is this logic correct?
$$ df(x, x+y) = f(x+\Delta x, y+\Delta y) - f(x,y) \approx \Big(\frac{f(x+\Delta x, x+\Delta x + y)-f(x, x+y)}{\Delta x}+1\Big)\Delta x + \Big(\frac{f(x, x + y + \Delta y)-f(x, x+y)}{\Delta y}\Big)\Delta y $$
Aside: Higher Order Terms
This is a largely unrelated question that I will not address here; I just want to point out that it applies equally well in a single-variable case like $\mathrm dy=\dfrac{\mathrm dy}{\mathrm dx}\,\mathrm dx$. It is probably mostly addressed by the question "Why isn't $df=\frac{\partial f}{\partial x}\:dx+\frac{\partial f}{\partial y}\:dy$ defined to resemble a Taylor series further?" and its comments/answers.
Working with Differentials
The General Case
Set $z=f(u,v)$, $u=x$, and $v=g(x,y)$.
Then $\mathrm{d}z=\dfrac{\partial f}{\partial u}\mathrm{d}u+\dfrac{\partial f}{\partial v}\mathrm{d}v$ by the multivariate chain rule (equivalently, by how differentials work).
And $\mathrm{d}v=\dfrac{\partial g}{\partial x}\mathrm{d}x+\dfrac{\partial g}{\partial y}\mathrm{d}y$ for the same reason.
And then (for good measure) $\mathrm{d}u=\dfrac{\partial u}{\partial x}\mathrm{d}x+\dfrac{\partial u}{\partial y}\mathrm{d}y=1\mathrm{d}x+0\mathrm{d}y=\mathrm{d}x$.
Therefore, $$\mathrm{d}z=\dfrac{\partial f}{\partial u}(\mathrm{d}x)+\dfrac{\partial f}{\partial v}\left(\dfrac{\partial g}{\partial x}\mathrm{d}x+\dfrac{\partial g}{\partial y}\mathrm{d}y\right)=\left(\dfrac{\partial f}{\partial u}+\dfrac{\partial f}{\partial v}*\dfrac{\partial g}{\partial x}\right)\mathrm{d}x+\dfrac{\partial f}{\partial v}*\dfrac{\partial g}{\partial y}\mathrm{d}y\text{.}$$
This shorthand may be a little unclear since there are both $v$s and $y$s in the final expression, so suppose we're at a point $(x,y)=(a,b)$. Then $u=x=a$ and $v=g(x,y)=g(a,b)$. We then have:
$\begin{align*} \left.\mathrm{d}z\right|_{\left(x,y\right)=\left(a,b\right)} & =\left(\left.\dfrac{\partial f}{\partial u}\right|_{\left(u,v\right)=\left(a,g(a,b)\right)}+\left.\dfrac{\partial f}{\partial v}\right|_{\left(u,v\right)=\left(a,g(a,b)\right)}*\left.\dfrac{\partial g}{\partial x}\right|_{\left(x,y\right)=\left(a,b\right)}\right)\mathrm{d}x\\ & \phantom{=}+\left.\dfrac{\partial f}{\partial v}\right|_{\left(u,v\right)=\left(a,g(a,b)\right)}*\left.\dfrac{\partial g}{\partial y}\right|_{\left(x,y\right)=\left(a,b\right)}\mathrm{d}y \end{align*}\tag{$\star$}$
(Note that if we had used $\dfrac{\partial f}{\partial x}$ in place of $\dfrac{\partial f}{\partial u}$, it would have been easier to make mistakes like calculating $\left.\dfrac{\partial f}{\partial u}\right|_{\left(u,v\right)=\left(a,b\right)}$ in place of $\left.\dfrac{\partial f}{\partial u}\right|_{\left(u,v\right)=\left(a,g(a,b)\right)}$, and similarly for the other partial derivative of $f$.)
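As a quick sanity check of the general formula, here is a worked example with hypothetical choices that do not appear in the question: $f(u,v)=u^{2}v$ and $g(x,y)=xy$. The partials are $\dfrac{\partial f}{\partial u}=2uv$, $\dfrac{\partial f}{\partial v}=u^{2}$, $\dfrac{\partial g}{\partial x}=y$, and $\dfrac{\partial g}{\partial y}=x$, and they are evaluated at $(u,v)=(x,xy)$:
$$\begin{aligned} \mathrm{d}z & =\left(\dfrac{\partial f}{\partial u}+\dfrac{\partial f}{\partial v}*\dfrac{\partial g}{\partial x}\right)\mathrm{d}x+\dfrac{\partial f}{\partial v}*\dfrac{\partial g}{\partial y}\mathrm{d}y\\ & =\left(2x(xy)+x^{2}*y\right)\mathrm{d}x+x^{2}*x\,\mathrm{d}y\\ & =3x^{2}y\,\mathrm{d}x+x^{3}\,\mathrm{d}y\text{.} \end{aligned}$$
This agrees with differentiating the composite directly, since $z=f(x,xy)=x^{2}(xy)=x^{3}y$ gives $\mathrm{d}z=3x^{2}y\,\mathrm{d}x+x^{3}\,\mathrm{d}y$.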
The Specific Case
In the case in which $g(x,y)=x+y$ so that $\dfrac{\partial g}{\partial x}\equiv\dfrac{\partial g}{\partial y}\equiv1$, things reduce to:
$$\left.\mathrm{d}z\right|_{\left(x,y\right)=\left(a,b\right)}=\left(\left.\dfrac{\partial f}{\partial u}\right|_{\left(u,v\right)=\left(a,a+b\right)}+\left.\dfrac{\partial f}{\partial v}\right|_{\left(u,v\right)=\left(a,a+b\right)}\right)\mathrm{d}x+\left.\dfrac{\partial f}{\partial v}\right|_{\left(u,v\right)=\left(a,a+b\right)}\mathrm{d}y$$
Note that the first $\left.\dfrac{\partial f}{\partial v}\right|_{\left(u,v\right)=\left(a,a+b\right)}$ term (multiplying $\mathrm{d}x$) is not "$1$", so nothing like $\mathrm{d}f=\left(\dfrac{\partial f}{\partial x}+1\right)\mathrm{d}x+\dfrac{\partial f}{\partial y}\mathrm{d}y$ is true.
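To see the difference concretely, here is a minimal numerical sketch. The test function $f(u,v)=uv^{2}$, the point $(a,b)=(1,2)$, and all helper names are assumptions chosen purely for illustration. It compares a numerical derivative of the composite $f(x,x+y)$ with the chain-rule coefficients and with the "$+1$" guess.

```python
# Hypothetical test function (not from the original post): f(u, v) = u * v**2.
def f(u, v):
    return u * v**2

def f_u(u, v):
    # Partial derivative of f with respect to its first slot.
    return v**2

def f_v(u, v):
    # Partial derivative of f with respect to its second slot.
    return 2 * u * v

def h(x, y):
    # The composite z = f(x, g(x, y)) with g(x, y) = x + y.
    return f(x, x + y)

def central_diff(func, t, step=1e-6):
    # Symmetric difference quotient approximating func'(t).
    return (func(t + step) - func(t - step)) / (2 * step)

a, b = 1.0, 2.0
u0, v0 = a, a + b  # the point (u, v) = (a, a + b)

# Numerical partial derivatives of the composite at (a, b).
dz_dx_numeric = central_diff(lambda x: h(x, b), a)
dz_dy_numeric = central_diff(lambda y: h(a, y), b)

dz_dx_chain = f_u(u0, v0) + f_v(u0, v0)  # dx coefficient from the chain rule
dz_dy_chain = f_v(u0, v0)                # dy coefficient from the chain rule
dz_dx_plus_one = f_u(u0, v0) + 1         # the "+1" guess from the question

print(dz_dx_numeric, dz_dx_chain, dz_dx_plus_one)
print(dz_dy_numeric, dz_dy_chain)
```

For this choice the chain rule gives $9+6=15$ for the $\mathrm{d}x$ coefficient and $6$ for the $\mathrm{d}y$ coefficient, matching the numerical derivatives, while the "$+1$" guess gives $10$.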
Approximating the Partials, too
The General Case
The meaning of $(\star)$ for a practical approximation is that $$\left.z\right|_{\left(x,y\right)=\left(a+\Delta a,b+\Delta b\right)}-\left.z\right|_{\left(x,y\right)=\left(a,b\right)}\approx\text{[the right side of }\star\text{ but with }\Delta a,\Delta b\text{ in place of }\mathrm{d}x,\mathrm{d}y\text{].}$$ In other words, $f\left(a+\Delta a,g(a+\Delta a,b+\Delta b)\right)-f\left(a,g(a,b)\right)\approx\cdots$.
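A small sketch of what this approximation means in practice, using exact partial derivatives. The functions $f(u,v)=v\sin u+v^{2}$ and $g(x,y)=xy$, the point $(a,b)$, and the helper names are all hypothetical choices for illustration. If $(\star)$ is right, the gap between the actual change and the linear prediction should shrink much faster than the step itself.

```python
import math

# Hypothetical f and g, chosen only for illustration.
def f(u, v):
    return math.sin(u) * v + v**2

def f_u(u, v):
    return math.cos(u) * v

def f_v(u, v):
    return math.sin(u) + 2 * v

def g(x, y):
    return x * y

def g_x(x, y):
    return y

def g_y(x, y):
    return x

def actual_change(a, b, da, db):
    # f(a + da, g(a + da, b + db)) - f(a, g(a, b))
    return f(a + da, g(a + da, b + db)) - f(a, g(a, b))

def linear_prediction(a, b, da, db):
    # Right side of (star) with da, db in place of dx, dy.
    # Partials of f are evaluated at (u, v) = (a, g(a, b)).
    u0, v0 = a, g(a, b)
    coeff_dx = f_u(u0, v0) + f_v(u0, v0) * g_x(a, b)
    coeff_dy = f_v(u0, v0) * g_y(a, b)
    return coeff_dx * da + coeff_dy * db

a, b = 0.5, 1.5
errors = []
for step in (1e-1, 1e-2, 1e-3):
    err = abs(actual_change(a, b, step, step) - linear_prediction(a, b, step, step))
    errors.append(err)
    print(step, err)
```

The printed errors are what the neglected higher-order terms contribute; for smooth $f$ and $g$ one would expect them to drop roughly a hundredfold each time the step drops tenfold.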
If you also want to approximate the partial derivatives in $(\star)$ with difference quotients, things get ugly, but it can be done.
Choosing small resolutions $\Delta u$, $\Delta v$, $\Delta x$, $\Delta y$ (none of which need have anything to do with $\Delta a$ and $\Delta b$), we get:
\begin{align*} & f\left(a+\Delta a,g(a+\Delta a,b+\Delta b)\right)-f\left(a,g(a,b)\right)\\ \approx & \left(\left.\dfrac{\partial f}{\partial u}\right|_{\left(u,v\right)=\left(a,g(a,b)\right)}+\left.\dfrac{\partial f}{\partial v}\right|_{\left(u,v\right)=\left(a,g(a,b)\right)}*\left.\dfrac{\partial g}{\partial x}\right|_{\left(x,y\right)=\left(a,b\right)}\right)\Delta a\\ & +\left.\dfrac{\partial f}{\partial v}\right|_{\left(u,v\right)=\left(a,g(a,b)\right)}*\left.\dfrac{\partial g}{\partial y}\right|_{\left(x,y\right)=\left(a,b\right)}\Delta b\\ \approx & \left(\dfrac{f\left(a+\Delta u,g(a,b)\right)-f\left(a,g(a,b)\right)}{\Delta u}+\dfrac{f\left(a,g(a,b)+\Delta v\right)-f\left(a,g(a,b)\right)}{\Delta v}*\dfrac{g(a+\Delta x,b)-g(a,b)}{\Delta x}\right)\Delta a\\ & +\dfrac{f\left(a,g(a,b)+\Delta v\right)-f\left(a,g(a,b)\right)}{\Delta v}*\dfrac{g(a,b+\Delta y)-g(a,b)}{\Delta y}\Delta b \end{align*}
Now, if we choose to make $\Delta x=\Delta a$ and $\Delta y=\Delta b$ (somewhat reasonable since they are both changes in the same inputs), this simplifies to:
\begin{align*} = & \dfrac{f\left(a+\Delta u,g(a,b)\right)-f\left(a,g(a,b)\right)}{\Delta u}\Delta a+\dfrac{f\left(a,g(a,b)+\Delta v\right)-f\left(a,g(a,b)\right)}{\Delta v}*\left(g(a+\Delta a,b)-g(a,b)\right)\\ & +\dfrac{f\left(a,g(a,b)+\Delta v\right)-f\left(a,g(a,b)\right)}{\Delta v}*\left(g(a,b+\Delta b)-g(a,b)\right) \end{align*}
And if we further choose to make $\Delta u=\Delta a$ (very reasonable since $u=x$ so $\Delta u=\Delta x=\Delta a$ makes sense), we get:
\begin{align*} = & f\left(a+\Delta a,g(a,b)\right)-f\left(a,g(a,b)\right)+\dfrac{f\left(a,g(a,b)+\Delta v\right)-f\left(a,g(a,b)\right)}{\Delta v}*\left(g(a+\Delta a,b)-g(a,b)\right)\\ & +\dfrac{f\left(a,g(a,b)+\Delta v\right)-f\left(a,g(a,b)\right)}{\Delta v}*\left(g(a,b+\Delta b)-g(a,b)\right) \end{align*}
With $v=g(x,y)$, we might want $\Delta v=g(a+\Delta a,b)-g(a,b)$ for cancellation in the first product and $\Delta v=g(a,b+\Delta b)-g(a,b)$ for cancellation in the second product. Unfortunately, for most choices of $g,\Delta a,$ and $\Delta b$, those expressions are not equal. However, if we can control $\Delta b$ or $\Delta a$ (or both), then for many values of $\left(a,b\right)$ and functions $g$, we might be able to arrange this by choosing $\Delta a$ and/or $\Delta b$ appropriately.
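The fully discretized approximation from this general case, with four independent resolutions, can be sketched as follows. The functions $f(u,v)=e^{u}v$ and $g(x,y)=x^{2}+y$, the numbers, and the name `discretized_change` are assumptions for illustration, not part of the derivation.

```python
import math

# Hypothetical f and g, chosen only for illustration.
def f(u, v):
    return math.exp(u) * v

def g(x, y):
    return x**2 + y

def discretized_change(f, g, a, b, da, db, du, dv, dx, dy):
    # Forward difference quotients standing in for the four partials.
    v0 = g(a, b)
    f0 = f(a, v0)
    f_u = (f(a + du, v0) - f0) / du          # df/du at (a, g(a, b))
    f_v = (f(a, v0 + dv) - f0) / dv          # df/dv at (a, g(a, b))
    g_x = (g(a + dx, b) - g(a, b)) / dx      # dg/dx at (a, b)
    g_y = (g(a, b + dy) - g(a, b)) / dy      # dg/dy at (a, b)
    return (f_u + f_v * g_x) * da + f_v * g_y * db

a, b = 1.0, 2.0
da, db = 0.01, 0.02

actual = f(a + da, g(a + da, b + db)) - f(a, g(a, b))
approx = discretized_change(f, g, a, b, da, db, du=1e-5, dv=1e-5, dx=1e-5, dy=1e-5)
print(actual, approx)
```

Keeping the resolutions separate from `da` and `db` makes it easy to experiment with the special choices discussed above, such as passing `dx=da`, `dy=db`, and `du=da`.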
The Specific Case
For example, if $g(x,y)=x+y$ and we set $\Delta b=\Delta a$, then $g(a,b+\Delta b)-g(a,b)=a+b+\Delta b-(a+b)=\Delta b=\boxed{\Delta a}=(a+\Delta a+b)-(a+b)=g(a+\Delta a,b)-g(a,b)$. In this very special case of $g(x,y)=x+y$, if we set $\Delta v=\Delta b=\Delta a$, both difference quotients in $v$ cancel against the factors multiplying them, and we get the following further simplification: \begin{align*} = & f\left(a+\Delta a,g(a,b)\right)-f\left(a,g(a,b)\right)+f\left(a,g(a,b)+\Delta a\right)-f\left(a,g(a,b)\right)\\ & +f\left(a,g(a,b)+\Delta a\right)-f\left(a,g(a,b)\right)\\ = & f\left(a+\Delta a,a+b\right)+2f\left(a,a+b+\Delta a\right)-3f\left(a,a+b\right) \end{align*}
Putting this all together, we have (in this very special case of $g(x,y)=x+y$): $$f\left(a+\Delta a,a+b+2\Delta a\right)-f\left(a,a+b\right)\approx f\left(a+\Delta a,a+b\right)+2f\left(a,a+b+\Delta a\right)-3f\left(a,a+b\right)$$ This can be simplified and rearranged to: $$f\left(a+\Delta a,a+b+2\Delta a\right)-f\left(a+\Delta a,a+b\right)\approx2\left(f\left(a,a+b+\Delta a\right)-f\left(a,a+b\right)\right)$$
You can confirm this numerically for your favorite choice of differentiable $f$ and small number $\Delta a$.
It also makes some intuitive sense: the left side is approximately $f\left(a,a+b+2\Delta a\right)-f\left(a,a+b\right)$ by continuity in the first input (strictly, continuity of $\dfrac{\partial f}{\partial v}$ in the first input, since both sides are already of order $\Delta a$), and that is approximately the right side by approximate linearity (i.e. differentiability) in the second input.
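The numerical confirmation suggested above can be sketched as follows, assuming the hypothetical choice $f(u,v)=e^{v}\sin u$ and the point $(a,b)=(0.3,0.4)$; the helper name `both_sides` is likewise made up for this example.

```python
import math

# Hypothetical differentiable f, chosen only for illustration.
def f(u, v):
    return math.sin(u) * math.exp(v)

def both_sides(f, a, b, da):
    # Left and right sides of the rearranged approximation for g(x, y) = x + y.
    lhs = f(a + da, a + b + 2 * da) - f(a + da, a + b)
    rhs = 2 * (f(a, a + b + da) - f(a, a + b))
    return lhs, rhs

a, b = 0.3, 0.4
gaps = []
for da in (1e-1, 1e-2, 1e-3):
    lhs, rhs = both_sides(f, a, b, da)
    relative_gap = abs(lhs - rhs) / abs(rhs)
    gaps.append(relative_gap)
    print(da, lhs, rhs, relative_gap)
```

Both sides are themselves of order $\Delta a$, so the relative gap is the meaningful quantity to watch; it should shrink as $\Delta a$ does.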