How can I understand a theorem about the composition of two functions:
$g = \gamma: I \subset \mathbb{R} \rightarrow \mathbb{R}^m$
$f: \mathbb{R}^m \rightarrow \mathbb{R}$
if it looks like this:
$d(f \circ \gamma)(t_0) = df(\gamma(t_0))d\gamma(t_0)=\langle\nabla f(\gamma(t_0)), \gamma^{\prime}(t_0)\rangle$
Isn't $d(f \circ \gamma)(t_0)$ just another way to write $df(\gamma(t_0))$? How can I understand this notation?
No. For a differentiable function $f:\mathbb R^n \to \mathbb R^m$, the differential takes two inputs, a base point and a direction: $$df:\mathbb R^n \times \mathbb R^n \to \mathbb R^m$$ It is linear in the second argument, but in general not in the first.
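To make the "two inputs" point concrete, here is a minimal sketch. The function $f(x,y) = x^2 y$ is my own illustrative choice, not something from your theorem; its differential at the point $p=(x,y)$ applied to the direction $v=(v_1,v_2)$ is $2xy\,v_1 + x^2 v_2$.

```python
def df(p, v):
    """Differential of the illustrative f(x, y) = x**2 * y at base point p, applied to direction v."""
    x, y = p
    v1, v2 = v
    # Partial derivatives of f at p are (2xy, x^2)
    return 2 * x * y * v1 + x**2 * v2

p = (1.0, 2.0)   # base point (first argument)
v = (3.0, 4.0)   # direction (second argument)
w = (1.0, -1.0)  # another direction

# Linear in the direction: df(p, 2v + 5w) agrees with 2 df(p, v) + 5 df(p, w)
combo = (2 * v[0] + 5 * w[0], 2 * v[1] + 5 * w[1])
print(df(p, combo), 2 * df(p, v) + 5 * df(p, w))

# Not linear in the base point: df(2p, v) differs from 2 df(p, v)
print(df((2 * p[0], 2 * p[1]), v), 2 * df(p, v))
```

The first printed pair agrees and the second does not, which is exactly the asymmetry between the two arguments.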
$$d(f\circ \gamma)(t_0) $$ is the differential of $f\circ \gamma$ evaluated at $t_0$. One argument remains unevaluated, so the result is a linear map. In your setting $f$ is real-valued, so $f\circ\gamma: I \to \mathbb R$ and this is a linear map $\mathbb R\to\mathbb R$.
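$$df(\gamma(t_0))$$ is the differential of $f$ evaluated at the point $\gamma(t_0)$. Again this is a linear map, but it takes in a vector: in your setting it is a linear map $\mathbb R^m \to \mathbb R$. The vector it is fed is the output of $d\gamma(t_0):\mathbb R \to \mathbb R^m$ applied to a real number, which has not been supplied yet. So the product $df(\gamma(t_0))\, d\gamma(t_0)$ is the composition of these two linear maps, which is again a linear map $\mathbb R \to \mathbb R$, and the chain rule says it equals $d(f\circ \gamma)(t_0)$.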
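The final equality is a basic, finite-dimensional version of the Riesz representation theorem: the linear map $df(\gamma(t_0)):\mathbb R^m \to \mathbb R$ is the same as matrix multiplication with a certain row vector, namely the gradient $\nabla f(\gamma(t_0))$. Applying it to $\gamma'(t_0)$ gives the inner product written out for you there.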
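PS: in dimension one, $df(x)(h)$ is the multiplication of real numbers $f'(x)h$. Here it's clear that the two terms you are asking about are different ($d(f\circ\gamma)(t_0)(h) = f'(\gamma(t_0))\gamma'(t_0)h$, whereas $df(\gamma(t_0))(h) = f'(\gamma(t_0))h$), and the stated equality is a generalisation of the chain rule from high school.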
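If it helps, the chain rule can also be checked numerically. The following is only a sketch under assumptions of my own: I picked $f(x,y) = x^2 y$ and the curve $\gamma(t) = (\cos t, \sin t)$ for illustration, and approximate $d(f\circ\gamma)(t_0)$ by a central finite difference.

```python
import math

def gamma(t):
    # Illustrative curve gamma: R -> R^2 (an assumption, not from the theorem)
    return (math.cos(t), math.sin(t))

def gamma_prime(t):
    return (-math.sin(t), math.cos(t))

def f(x, y):
    # Illustrative real-valued function f: R^2 -> R
    return x**2 * y

def grad_f(x, y):
    return (2 * x * y, x**2)

def d_gamma(t0, h):
    # d(gamma)(t0) is a linear map R -> R^2: h maps to gamma'(t0) * h
    gp = gamma_prime(t0)
    return (gp[0] * h, gp[1] * h)

def d_f(p, v):
    # df(p) is a linear map R^2 -> R: v maps to <grad f(p), v>
    g = grad_f(*p)
    return g[0] * v[0] + g[1] * v[1]

def chain_rule(t0, h):
    # Composition df(gamma(t0)) after d(gamma)(t0), a linear map R -> R
    return d_f(gamma(t0), d_gamma(t0, h))

def d_composite(t0, h, eps=1e-6):
    # d(f o gamma)(t0) applied to h, via a central finite difference
    deriv = (f(*gamma(t0 + eps)) - f(*gamma(t0 - eps))) / (2 * eps)
    return deriv * h

t0, h = 0.7, 2.0
print(d_composite(t0, h), chain_rule(t0, h))
```

The two printed numbers should agree up to the finite-difference error, while `d_f(gamma(t0), ...)` on its own still expects a vector in $\mathbb R^2$, not a real number.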