Here is the one possible definition of the universal hash family (from my professor's book):
A family $\{h_a | a \in A\}$ of functions $X \to Y$ is an r-universal hash family if, and only if, for every r-tupel $(x_1,...,x_r) \in X^r$ with all $x_j$ distinct, and every r-tuple $(y_1,...,y_r) \in Y^r$, it holds that:
$$P_{a \in A}(h_a(x_1)=y_1 \wedge...h_a(x_r)=y_r) ~=~ 1/|Y|^r$$
Generally, I understand the definition yet the one thing I found rather unclear. As we draw $a$ uniformly at random from $A$ and it should be so for every value we are hashing why the definition utilizes the same $a$ for hashing all the $r$ different inputs?
What am I missing here?
The most universal family one can imagine is of size $|Y|^n$ with $n=|X|$, namely the family $$ \{h_{(y_1,...,y_n)}\mid h_{(y_1,...,y_n)}(x_i)=y_i, (y_1,...,y_n)\in Y^n\} $$ This family is $n$-universal. Choosing a hash function uniformly at random from this family essentially means choosing an $n$-tuple $(y_1,...,y_n)$ uniformly at random from $Y^n$. Thus any $n$-tuple output is equally likely.
At the other end of the scale we have the $1$-universal family I suggested in the comment section, namely $$ \{h_y\mid h_y(x)=y, y\in Y\} $$ where a single output from a randomly chosen function will appear random, but not more than that.
For $1<r<n$ we have $r$-universal families that make $r$ outputs from the same uniformly randomly chosen hash function appear uniformly random. It should appear as if $h_a$ outputs an $r$-tuple in $Y^r$ chosen uniformly at random.