These course notes give the following identity as (8) under subsection 1.1.3 with the following explanation(with a typo; the corrected form is from generatingfunctionology identity 1.31):
Sometimes a problem that does not seem to be a multi-index problem can be reformulated to be of the correct form. Consider, for instance, the problem of finding the generating function for binomial coefficients $\binom{n}{k}$ for fixed k as a function of n. The trick is to think of each set as being listed in order, from 1 to n. Take a set B with 2k + 1 points. There is a bijection between multi-indices on B and subsets with k elements. The multi-index corresponding to a subset counts the number of elements between each of the elements of the subset and also counts each element.
$$\sum _{n=0}^\infty \binom{n}{k}y^n = \frac{y^k}{(1-y)^{k+1}}$$
I couldn't follow the reasoning in that paragraph; could I get a detailed, step-by-step explanation or a different combinatorial argument for this identity?
We utilize a toy version of combinatorial class.
Definition. For each countable set of words $\mathcal{A}$, we define its generating function $\mathcal{A}(x)$ by $$ \mathcal{A}(z) = \sum_{a \in \mathcal{A}} z^{|a|}, $$ where $|a|$ is the length of the word $a$.
Observation. Let $\mathcal{A}$ and $\mathcal{B}$ be two countable sets of words, and assume the following property holds:
By collecting all possible concatenations, we form a new set of words $\mathcal{A}\cdot\mathcal{B}$: $$ \mathcal{A} \cdot \mathcal{B} = \{ ab : a \in \mathcal{A}, b \in \mathcal{B} \} $$ Under the assumption, it follows that $$ (\mathcal{A}\cdot\mathcal{B})(z) = \sum_{a\in\mathcal{A}, b\in\mathcal{B}} z^{|ab|} = \sum_{a\in\mathcal{A}}\sum_{b\in\mathcal{B}} z^{|a|+|b|} = \mathcal{A}(z)\mathcal{B}(z). $$
Obviously this observation generalizes to an arbitrary finite number of concatenations under the uniqueness assumption extending the above property.
Now let $\mathcal{C}$ be the set of all words of alphabet $\{\mathtt{a}, \mathtt{b}\}$ containing exactly $k$ letters $\mathtt{b}$. We compute $\mathcal{C}(z)$ in two ways and establish OP's identity.
1. Each word of length $n$ in $\mathcal{C}$ has exactly $n-k$ $\mathtt{a}$'s and $k$ $\mathtt{b}$'s. The number of all such words of length $n$ is precisely $\binom{n}{k}$, and so,
$$ \mathcal{C}(z) = \sum_{n=0}^{\infty} |\{ c \in \mathcal{C} : |c| = n \}| z^n = \sum_{n=0}^{\infty} \binom{n}{k} z^n. $$
2. On the other hand, it is easy to see that each word in $\mathcal{C}$ uniquely factors into the concatenation of $k+1$ sequences of $\mathtt{a}$'s and $k$ letters $\mathtt{b}$'s. For example,
$$ \mathtt{aaababbaa} = \mathtt{a}^3\mathtt{b}\mathtt{a}^1\mathtt{b}\mathtt{a}^0\mathtt{b}\mathtt{a}^2. $$
To make use of this observation, we write $\mathcal{A} = \{ \mathtt{a}^i : i \geq 0\}$ for the set of all words of alphabet $\{\mathtt{a}\}$, and let $\mathcal{B} = \{ \mathtt{b} \}$. Then
\begin{align*} \mathcal{C} = \{ \mathtt{a}^{i_0}\mathtt{b}\mathtt{a}^{i_1}\mathtt{b}\cdots\mathtt{b}\mathtt{a}^{i_k} : i_1, \ldots, i_k \geq 0 \} = \underbrace{\mathcal{A} \cdot \mathcal{B} \cdot \mathcal{A} \cdot \mathcal{B} \cdots \mathcal{B} \cdot \mathcal{A}}_{\text{total $k$ $\mathcal{B}$'s}}. \end{align*}
Hence, the corresponding generating functions satisfy the relation
$$ \mathcal{C}(z) = \underbrace{ \mathcal{A}(z) \mathcal{B}(z) \mathcal{A}(z) \mathcal{B}(z) \cdots \mathcal{B}(z) \mathcal{A}(z) }_{\text{total $k$ $\mathcal{B}(z)$'s}} = \mathcal{A}(z)^{k+1}\mathcal{B}(z)^k. $$
Now the desired conclusion follows by noting that $\mathcal{A}(z) = \sum_{i\geq 0} z^i = \frac{1}{1-z}$ and $\mathcal{B}(z) = z$.