Why is the magnitude of the gradient equal to the maximum rate of change at that point?

I understand that the gradient is the vector of partial derivatives of f with respect to each variable, so it gives the direction in the input space to move in for the greatest increase in f. What I don't understand is why the magnitude of the gradient is the actual maximum rate of change of f. It feels right that it should be, but I can't quite join the dots and see a proper reason for it.
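
To make the claim concrete, here is a rough numerical sketch of what is being asked. It assumes f is differentiable, in which case the rate of change along a unit vector u is the dot product ∇f · u = |∇f| cos θ, where θ is the angle between u and ∇f, so it can never exceed |∇f| and reaches it when u points along the gradient. The function `f`, the point `(1.0, 2.0)`, the step `h`, and all helper names are made up for illustration; the directional derivative is estimated by a central difference so the check does not rely on the dot-product formula itself.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 + 3*y

def grad_f(x, y):
    # Analytic gradient of the example f: (df/dx, df/dy).
    return (2*x, 3.0)

def directional_rate(x, y, theta, h=1e-6):
    # Central-difference estimate of the rate of change of f at (x, y)
    # along the unit vector u = (cos theta, sin theta).
    ux, uy = math.cos(theta), math.sin(theta)
    return (f(x + h*ux, y + h*uy) - f(x - h*ux, y - h*uy)) / (2*h)

def max_rate_over_directions(x, y, samples=3600):
    # Brute-force search over evenly spaced unit directions.
    return max(
        directional_rate(x, y, 2*math.pi*k/samples) for k in range(samples)
    )

point = (1.0, 2.0)
gx, gy = grad_f(*point)
grad_norm = math.hypot(gx, gy)          # |grad f| at the point
best = max_rate_over_directions(*point)  # largest sampled rate of change

print(f"|grad f|          = {grad_norm:.6f}")
print(f"max sampled rate  = {best:.6f}")
```

The two printed values should agree to within the sampling and finite-difference error, and no sampled direction should give a rate larger than the gradient's magnitude.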