To get the normal vector to the tangent plane at a given point $(x,y,z)$ on a surface $U$, we can apply the gradient operator $\nabla$ to the equation of the surface; the result gives the components of the normal.
Also, if we want the change $dU$ in a function $U$, then we have
$$dU = \frac{\partial U}{\partial x}\,dx + \frac{\partial U}{\partial y}\,dy + \frac{\partial U}{\partial z}\,dz = \nabla U \cdot d\mathbf{l} = |\nabla U|\,|d\mathbf{l}|\cos\theta$$
where $d\mathbf{l} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$ and $\theta$ is the angle between the vectors $d\mathbf{l}$ and $\nabla U$.
This means that, for a step $d\mathbf{l}$ of fixed length, the largest change in $U$ occurs when $\theta=0$, i.e. when $d\mathbf{l}$ points in exactly the same direction as $\nabla U$.
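
As a sanity check on the formula $dU = \nabla U \cdot d\mathbf{l}$, here is a minimal Python sketch. The field $U = x^2 + y^2 + z^2$, the point $(1, 2, 2)$, and the helper name `change_in_U` are my own choices for illustration only; nothing about them is special. It compares the first-order change in $U$ for a small step along $\nabla U$ with the change for a step of the same length perpendicular to $\nabla U$.

```python
import math

def U(x, y, z):
    # Hypothetical example field, chosen only for illustration.
    return x**2 + y**2 + z**2

def grad_U(x, y, z):
    # Analytic gradient of the example field above.
    return (2 * x, 2 * y, 2 * z)

def change_in_U(point, direction, step=1e-6):
    # First-order estimate dU = grad U . dl, where dl has length `step`
    # and points along `direction` (which need not be normalised).
    norm = math.sqrt(sum(c * c for c in direction))
    dl = tuple(step * c / norm for c in direction)
    g = grad_U(*point)
    return sum(gi * di for gi, di in zip(g, dl))

point = (1.0, 2.0, 2.0)
g = grad_U(*point)

# Step along the gradient (theta = 0).
along = change_in_U(point, g)

# Step perpendicular to the gradient (theta = 90 degrees):
# (2, -1, 0) has zero dot product with grad U = (2, 4, 4).
across = change_in_U(point, (2.0, -1.0, 0.0))

print("dU along grad U: ", along)
print("dU across grad U:", across)
```

With a fixed step length, the step along $\nabla U$ should give the larger value, matching the $\cos\theta$ factor above.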
So $\nabla U$ is a vector pointing in the direction in which $U$ changes most quickly. But surely $U$ is not changing at all in the direction normal to itself? (This seems to contradict the statement at the top of this question.)
Can you see where my understanding falters?