Say we have a function
$$\begin{cases} f: \mathbb{R}^n \to \mathbb{R}\\ f(\mathbf{v}) = \mathbf{u}^\top \mathbf{v} \end{cases}$$
where $\mathbf{u} \in \mathbb{R}^n$.
Apparently, taking the partial derivative of $f$ with respect to $\mathbf{v}$ yields $\mathbf{u}$:
$$\frac{\partial f}{\partial \mathbf{v}} = \mathbf{u}$$
Why is that? This makes no sense to me. Since $f$ returns real numbers, I would have assumed that the rate of change of $f$ is a real number. Why is the rate of change a vector? Vectors are not even part of the codomain of $f$.
Also, which subject do I need to look into for this? I was just confronted with the isolated claim that $\frac{\partial f}{\partial \mathbf{v}} = \mathbf{u}$ here.
$\frac{\partial f}{\partial \mathbf{v}}$ is shorthand for $\left(\frac{\partial f}{\partial v_1}, \dots, \frac{\partial f}{\partial v_n}\right)$; in other words, it is the gradient of $f$. In this case, if you expand the dot product in terms of the components, you obtain $\frac{\partial f}{\partial v_i} = u_i$, so $\frac{\partial f}{\partial \mathbf{v}} = \mathbf{u}$.
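To spell out the expansion step, here is the computation for a single component, using only the definitions from the question (the summation index $j$ is introduced just for this illustration):

$$
\begin{aligned}
f(\mathbf{v}) &= \mathbf{u}^\top \mathbf{v} = \sum_{j=1}^{n} u_j v_j, \\
\frac{\partial f}{\partial v_i} &= \sum_{j=1}^{n} u_j \frac{\partial v_j}{\partial v_i} = u_i,
\end{aligned}
$$

since $\frac{\partial v_j}{\partial v_i}$ equals $1$ when $j = i$ and $0$ otherwise. Each entry $u_i$ is a real number: it is the rate of change of $f$ when only the coordinate $v_i$ varies. There are $n$ such coordinates, so collecting the $n$ rates gives a vector in $\mathbb{R}^n$, even though $f$ itself is real-valued.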
In general, most rules for taking derivatives generalise well to derivatives with respect to vectors, as is done here, or even matrices. For a useful reference, I recommend The Matrix Cookbook, which has a list of such identities. Proving a few of them might help your understanding.
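If it helps to see the claim $\frac{\partial f}{\partial \mathbf{v}} = \mathbf{u}$ numerically, the following is a minimal sketch in plain Python. The helper `numerical_gradient`, the step size `h`, and the sample vectors are assumptions made up for this illustration; it estimates each partial derivative with a central difference and compares the result with $\mathbf{u}$.

```python
def f(u, v):
    """Dot product u^T v of two equal-length sequences of floats."""
    return sum(ui * vi for ui, vi in zip(u, v))


def numerical_gradient(func, v, h=1e-6):
    """Central-difference estimate of (df/dv_1, ..., df/dv_n) at the point v.

    Hypothetical helper for illustration only, not a library function.
    """
    grad = []
    for i in range(len(v)):
        forward = list(v)
        backward = list(v)
        forward[i] += h   # nudge only the i-th coordinate up
        backward[i] -= h  # and down
        grad.append((func(forward) - func(backward)) / (2 * h))
    return grad


# Arbitrary sample data (assumed for the example).
u = [2.0, -1.0, 0.5]
v = [0.3, 4.0, -2.0]

grad = numerical_gradient(lambda x: f(u, x), v)
print(grad)  # one real number per coordinate of v, each close to the matching u_i
```

Because $f$ is linear in $\mathbf{v}$, the estimate does not depend on the point `v` chosen; changing `v` should leave the printed values essentially unchanged, up to floating-point rounding.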