Why does the Jacobian matrix
$$ J = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \dots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1} & \dots & \frac{\partial f_n}{\partial x_n} \end{pmatrix} $$
work, and where does it come from?
I came across this matrix in a multivariable calculus context where it was used to do multivariate substitution. It seems so arbitrary to me and I don't understand where it comes from. Can anyone give some insight or intuition about this?
For a function of one variable, the derivative is the slope of the tangent line to the graph of $f(x)$.
The tangent line to the curve $y=f(x)$ at $x=a$ is given by
$$h(x)=f(a)+f'(a)(x-a)$$
As $x \rightarrow a$, $f(x)$ approaches the tangent line $h(x)$. In a computer algebra system like Mathematica, if you zoom in at the point $x=a$, the graph of $f(x)$ looks more and more like the tangent line $h(x)$, whose slope is $f'(a)$.
In fact, $f$ is said to be differentiable at $x=a$ if there is a number $f'(a)$ for which the line $h(x)=f(a)+f'(a)(x-a)$ satisfies
$$\lim_{x \rightarrow a} \frac{f(x)-h(x)}{x-a}=0$$
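To see this limit numerically, here is a minimal sketch in Python. The choice $f(x)=\sin x$ and the point $a=1$ are assumptions made purely for illustration; any differentiable function would do. It prints the ratio $|f(x)-h(x)|/|x-a|$ for shrinking step sizes.

```python
import math

# Hypothetical example function and its derivative (not from the question).
def f(x):
    return math.sin(x)

def f_prime(x):
    return math.cos(x)

def tangent_line(a):
    """Return the function h(x) = f(a) + f'(a) * (x - a)."""
    fa, slope = f(a), f_prime(a)
    return lambda x: fa + slope * (x - a)

a = 1.0
h = tangent_line(a)

# Ratio |f(x) - h(x)| / |x - a| for x = a + 10^-1, ..., a + 10^-5.
ratios = []
for k in range(1, 6):
    dx = 10.0 ** (-k)
    x = a + dx
    ratios.append(abs(f(x) - h(x)) / abs(x - a))

for k, r in enumerate(ratios, start=1):
    print(f"dx = 1e-{k}: |f(x) - h(x)| / |x - a| = {r:.3e}")
```

The ratio shrinks roughly in proportion to the step size, which is what "the error goes to zero faster than $x-a$" means in practice.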
The Jacobian matrix plays the role of the derivative of a vector-valued function $\mathbf{f}$,
$$\mathbf{f}=(f_1(x_1,\ldots,x_n),f_2(x_1,\ldots,x_n),\ldots,f_m(x_1,\ldots,x_n))$$
of $n$ input variables and $m$ output variables.
For concreteness, assume $m=1$, $n=2$, that is, a scalar function of two variables.
Analogous to the single-variable case, the tangent plane to the surface $z=f(x,y)$ at the point $(a,b)$ is given by
$$h(x,y)=f(a,b)+Df(a,b)\cdot (x-a,y-b)$$
where $Df(a,b)=\left[\frac{\partial f}{\partial x}(a,b),\frac{\partial f}{\partial y}(a,b)\right]$ is the Jacobian matrix, here a $1\times 2$ row of partial derivatives evaluated at $(a,b)$.
As $(x,y) \rightarrow (a,b)$, the surface $z=f(x,y)$ approaches the tangent plane $z=h(x,y)$. In Mathematica, if you zoom in at the point $(a,b)$, the surface looks more and more like the tangent plane. $\partial f/\partial x$ is the rate at which the function value increases per unit of a small bump $\Delta x$ in $x$, so the increase is approximately $\frac{\partial f}{\partial x}\Delta x$. Likewise, $\partial f/\partial y$ is the rate of increase for a small bump $\Delta y$ in $y$.
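As a rough illustration of the tangent-plane idea, the sketch below compares the actual change in a function with the change predicted by its partial derivatives. The function $f(x,y)=x^2y+y^3$, the point $(1,2)$, and the bump sizes are all assumptions chosen for the example.

```python
# Hypothetical example function of two variables (not from the question).
def f(x, y):
    return x * x * y + y ** 3

def jacobian(x, y):
    """Partial derivatives (df/dx, df/dy) of the example f, computed by hand."""
    return (2 * x * y, x * x + 3 * y * y)

def tangent_plane(a, b):
    """Return h(x, y) = f(a, b) + df/dx * (x - a) + df/dy * (y - b)."""
    fab = f(a, b)
    fx, fy = jacobian(a, b)
    return lambda x, y: fab + fx * (x - a) + fy * (y - b)

a, b = 1.0, 2.0
h = tangent_plane(a, b)

# Small bumps in x and y.
dx, dy = 1e-3, -2e-3
actual_change = f(a + dx, b + dy) - f(a, b)
linear_change = h(a + dx, b + dy) - f(a, b)

print("actual change:", actual_change)
print("linear change:", linear_change)
print("difference   :", actual_change - linear_change)
```

The linear change is just $\frac{\partial f}{\partial x}\Delta x+\frac{\partial f}{\partial y}\Delta y$, and the difference from the actual change is second order in the bump sizes.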
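Along similar lines, a vector-valued function $\mathbf{f}$ is said to be differentiable at $\mathbf{a}$ if there is a matrix $D\mathbf{f}(\mathbf{a})$ (the Jacobian) for which the linear approximation $\mathbf{h}(\mathbf{x})=\mathbf{f}(\mathbf{a})+D\mathbf{f}(\mathbf{a})(\mathbf{x}-\mathbf{a})$ satisfies
$$\lim_{\mathbf{x} \rightarrow \mathbf{a}} \frac{\mathbf{f}(\mathbf{x})-\mathbf{h}(\mathbf{x})}{\|\mathbf{x}-\mathbf{a}\|}=\mathbf{0}$$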
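The same check can be sketched for a map with several outputs. The example below is only an illustration under stated assumptions: it uses the polar-to-Cartesian map $(r,\theta)\mapsto(r\cos\theta, r\sin\theta)$ as a stand-in for $\mathbf{f}$, and the helper `numerical_jacobian` is a hypothetical name that approximates the $m\times n$ matrix of partial derivatives by central differences rather than computing it exactly.

```python
import math

def f(v):
    """Hypothetical example map R^2 -> R^2: polar to Cartesian coordinates."""
    r, theta = v
    return [r * math.cos(theta), r * math.sin(theta)]

def numerical_jacobian(func, a, eps=1e-6):
    """Approximate the m x n Jacobian of func at a by central differences."""
    m, n = len(func(a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(a)
        backward = list(a)
        forward[j] += eps
        backward[j] -= eps
        f_plus, f_minus = func(forward), func(backward)
        for i in range(m):
            # Entry (i, j) is the partial derivative of output i w.r.t. input j.
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J

def norm(v):
    return math.sqrt(sum(c * c for c in v))

a = [2.0, math.pi / 6]
J = numerical_jacobian(f, a)

def h(x):
    """Linear approximation h(x) = f(a) + J (x - a)."""
    fa = f(a)
    d = [xj - aj for xj, aj in zip(x, a)]
    return [fa[i] + sum(J[i][j] * d[j] for j in range(len(a)))
            for i in range(len(fa))]

# Ratio ||f(x) - h(x)|| / ||x - a|| as x approaches a.
ratios = []
for k in range(1, 5):
    t = 10.0 ** (-k)
    x = [a[0] + t, a[1] - t]
    err = [fi - hi for fi, hi in zip(f(x), h(x))]
    ratios.append(norm(err) / norm([xj - aj for xj, aj in zip(x, a)]))

for row in J:
    print(row)
print(ratios)
```

Each column of `J` records how the outputs respond to a bump in one input, and the shrinking ratios are the numerical counterpart of the limit above.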