[Beginning calculus question.] I think we can approximate the value of a function of two variables near a point $(a,b)$ by saying
$$ f(a+\Delta x , b+ \Delta y) \approx f(a,b) + f_x(a,b)\Delta x +f_y(a,b)\Delta y .$$
I am visualizing this by bending a piece of paper into a bendy surface, and putting a rigid clipboard tangent to it. I know the height of the paper at a point $(a,b)$, according to an imagined coordinate system in the room.
If I want to know the height of a nearby point on the surface, $f(a+\Delta x , b+ \Delta y)$, I could use that formula: I move along the clipboard in the $x$ direction, which raises me by $f_x(a,b)\Delta x$, and then in the $y$ direction, which raises me by $f_y(a,b)\Delta y$.
However, if I first move the appropriate amount along the $x$ axis, $f_y(x+\Delta x, y)$ could be different from $f_y(x,y)$. But the approximation formula doesn't seem to take notice of that. It seems like I should be using a different $y$-derivative (without loss of generality) when I am moving along two axes than if I were only moving along one.
What am I missing? Why does the approximation formula work like that?
Your reasoning and your visualization technique are correct. In effect, you are approximating the surface by its tangent plane at the point (the clipboard). This might not be a very good approximation once you move far from that point, but it's the best you can do using first derivative information at a single point.
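As a rough numerical illustration, here is a small Python sketch. The function $f(x,y) = \sin(x)\,e^{y}$, the base point $(1.0, 0.5)$, and the helper name `tangent_plane` are all chosen just for this example and are not from the question. It compares the true value with the clipboard estimate as the step $h$ shrinks. Halving $h$ should cut the error by roughly a factor of four, since the terms the formula leaves out are of second order in the step.

```python
import math

# Example function and its partial derivatives (assumed for illustration only).
def f(x, y):
    return math.sin(x) * math.exp(y)

def f_x(x, y):
    return math.cos(x) * math.exp(y)

def f_y(x, y):
    return math.sin(x) * math.exp(y)

def tangent_plane(a, b, dx, dy):
    """Linear estimate of f(a + dx, b + dy) using only data at (a, b)."""
    return f(a, b) + f_x(a, b) * dx + f_y(a, b) * dy

a, b = 1.0, 0.5  # base point, chosen arbitrarily
errors = []
for h in (0.1, 0.05, 0.025):
    # Step by h in both directions and measure how far off the estimate is.
    err = abs(f(a + h, b + h) - tangent_plane(a, b, h, h))
    errors.append(err)
    print(f"h = {h:<6} error = {err:.3e}")

# Ratio of successive errors; values near 4 indicate a second-order error.
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print("error ratios:", [round(r, 2) for r in ratios])
```

The shrinking error is why ignoring the change in $f_y$ is harmless for small steps: its effect is much smaller than the terms the formula keeps.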
You're right that $f_y(x+\Delta x, y)$ might be different from $f_y(x,y)$, but the approximation process assumes that we only know what's happening at $(x,y)$, not at $(x+\Delta x, y)$ or any other point. If you had information about second partial derivatives at $(x,y)$, you could use this to get a better estimate for $f_y(x+\Delta x, y)$, instead of just assuming that it's the same as $f_y(x,y)$.
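To make that concrete, here is a sketch, assuming $f$ has continuous second partial derivatives near $(x,y)$ (an assumption the question does not state). The corrected $y$-derivative after the move in $x$ is, to first order,
$$ f_y(x+\Delta x, y) \approx f_y(x,y) + f_{yx}(x,y)\,\Delta x , $$
where $f_{yx}$ denotes the $x$-derivative of $f_y$. Using this slope for the $y$ step changes the estimate by $f_{yx}(x,y)\,\Delta x\,\Delta y$, a product of two small quantities. Under the same assumption $f_{yx} = f_{xy}$, and the full second-order approximation is
$$ f(x+\Delta x , y+ \Delta y) \approx f + f_x\Delta x + f_y\Delta y + \tfrac{1}{2}\left( f_{xx}\,\Delta x^2 + 2 f_{xy}\,\Delta x\,\Delta y + f_{yy}\,\Delta y^2 \right), $$
with every derivative evaluated at $(x,y)$. So the linear formula does ignore the change in $f_y$, but the cost is an error of second order in the step, which is small compared with the first-order terms when $\Delta x$ and $\Delta y$ are small.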