So in a lot of math that I do, I find that unless I have some intuition or deep understanding of a concept, I won't get very far with just memorization alone. Our textbook introduced the gradient to us but didn't do a very good job of explaining why it's defined as $\langle f_x(x, y), f_y(x, y) \rangle$.
I do want to note that I don't know a lot of complex math, and I don't do well with abstract symbols or terminology. Could someone please help me understand why the gradient vector is defined this way?
The point of the derivative is to approximate a function near a point $(x_0,y_0)$ by a linear (actually affine) function: $f(x,y) \approx f(x_0,y_0) + f_x(x_0,y_0) (x-x_0) + f_y(x_0,y_0) (y-y_0)$. We can write this as $f(x,y) \approx f(x_0,y_0) + (f_x(x_0,y_0),f_y(x_0,y_0)) \cdot (x-x_0, y-y_0)$.
That is, the difference $f(x,y)-f(x_0,y_0)$ can be approximated by the inner product (dot product) of the vector $(f_x(x_0,y_0),f_y(x_0,y_0))$ and the perturbation $(x-x_0, y-y_0)$. The vector $(f_x(x_0,y_0),f_y(x_0,y_0))$ is called the gradient of $f$ at $(x_0,y_0)$.
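If it helps to see this with numbers, here is a minimal Python sketch of the approximation above. The function `f`, the base point `(1, 2)`, and the perturbation direction `(h, 2h)` are all assumptions chosen only for illustration; the partial derivatives in `gradient` were worked out by hand for this particular `f`.

```python
def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y + 3.0 * y


def gradient(x, y):
    # Partial derivatives of the example f, computed by hand:
    # f_x = 2xy and f_y = x^2 + 3.
    return (2.0 * x * y, x**2 + 3.0)


def linear_approx(x, y, x0, y0):
    # f(x0, y0) plus the dot product of the gradient at (x0, y0)
    # with the perturbation (x - x0, y - y0).
    gx, gy = gradient(x0, y0)
    return f(x0, y0) + gx * (x - x0) + gy * (y - y0)


x0, y0 = 1.0, 2.0  # assumed base point
steps = (0.1, 0.01, 0.001)
errors = []
for h in steps:
    # Perturb the base point by (h, 2h) and compare f with its approximation.
    x, y = x0 + h, y0 + 2.0 * h
    error = abs(f(x, y) - linear_approx(x, y, x0, y0))
    errors.append(error)
    print(f"h = {h}: approximation error = {error:.2e}")
```

For this example the error shrinks roughly like $h^2$ as the perturbation gets smaller, while the perturbation itself only shrinks like $h$. That is the sense in which the gradient gives the best linear approximation near $(x_0, y_0)$.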