I've read dozens of questions and answers here about the use of unit vectors when calculating a directional derivative, and I get that it's only a convention and not mandatory.
But the tendency to use unit vectors in the first place has still not clicked for me, so I'd like to use a concrete example given in this video (part of a Khan Academy video series, made by the creator of 3Blue1Brown):
Given $f(x,y) = x^2y$ and $\vec v = \begin{bmatrix} \frac{\sqrt2}{2}\\\frac{\sqrt2}{2}\end{bmatrix}$, the directional derivative of $f$ at the point $(-1, -1)$ is: $$ \nabla_{\vec v} f(-1,-1) = \begin{bmatrix} 2\\1\end{bmatrix} \cdot \begin{bmatrix} \frac{\sqrt2}{2}\\\frac{\sqrt2}{2}\end{bmatrix} = \sqrt2 + \frac{\sqrt2}{2} $$
However (this part starts here):
This only works if your vector is a unit vector, because [...] if you scale $\vec v$ by 2 [...] the derivative will become twice the value, [i.e.] $2\nabla f \cdot \vec v$, but you don't necessarily want that, because the plane that you sliced it with - if instead of doing it in the direction of $\vec v$ the unit vector, you did it in the direction of $2\vec v$ - it's the same plane, it's the same slice you're taking, and you'd want that same slope - so that's gonna mess everything up. This is super important if you're thinking about things in the context of slope.
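To check my reading of the quote, here is a minimal numerical sketch of my own (it is not from the video). It assumes the gradient of $f(x,y) = x^2y$ is $(2xy, x^2)$, which I computed by hand, and the function names `grad_f`, `directional_derivative`, and `normalized_directional_derivative` are made up for illustration. It compares $\nabla f \cdot \vec v$ with the same quantity divided by $\|\vec v\|$, for both $\vec v$ and $2\vec v$:

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 * y, worked out by hand: (2xy, x^2).
    return (2 * x * y, x * x)

def directional_derivative(point, v):
    # Unnormalized convention: grad f(point) . v, with no division by |v|.
    gx, gy = grad_f(*point)
    return gx * v[0] + gy * v[1]

def normalized_directional_derivative(point, v):
    # "Slope" convention: divide by |v| so only the direction of v matters.
    return directional_derivative(point, v) / math.hypot(v[0], v[1])

p = (-1.0, -1.0)
v = (math.sqrt(2) / 2, math.sqrt(2) / 2)   # unit vector from the example
v2 = (2 * v[0], 2 * v[1])                  # same direction, twice the length

d_v = directional_derivative(p, v)
d_2v = directional_derivative(p, v2)
slope_v = normalized_directional_derivative(p, v)
slope_2v = normalized_directional_derivative(p, v2)

print("grad f . v      =", d_v)
print("grad f . (2v)   =", d_2v)
print("normalized, v   =", slope_v)
print("normalized, 2v  =", slope_2v)
```

If I run this, `d_2v` comes out as twice `d_v`, while the two normalized values agree with each other (and with `d_v`, since $\vec v$ already has length 1). So the doubling itself is clear to me; my question is about why it is considered a problem.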
I hope I'm not just being pedantic, as I feel I'm missing something, but why does this "only work" if $\vec v$ is a unit vector, and why would it otherwise "mess everything up"?
- Is it just because, by definition, a directional derivative should not include a scaling factor ($2$, in this case)?
- Is it just because, by definition, a slope is the rate of change per unit distance?

Or is there something more fundamental behind this quote?
