The Wikipedia page for the multinomial distribution says the following:
For $n$ independent trials each of which leads to a success for exactly one of $k$ categories, with each category having a given fixed success probability, the multinomial distribution gives the probability of any particular combination of numbers of successes for the various categories.
But let's say we have a binomial point process with $n$ points over some region containing disjoint Borel sets $B_1, B_2, \dots, B_n$ (I think "Borel set" is the correct terminology here, but I'm not sure). Let $Y_i = N_{\mathbf{X}}(B_i)$ be the random variable describing the number of points in the Borel set $B_i$. Then the $Y_i$ are themselves binomial random variables, right? But then the $Y_i$ are all dependent, since $Y_1 + Y_2 + \dots + Y_n \le n$. But all of these dependent binomial random variables together would form a multinomial distribution with parameters $(n, p_1, p_2, \dots, p_n)$, right? So how do we reconcile this with the fact that the multinomial distribution requires independence?
I realise that I've used very flimsy language here. I would appreciate it if someone would please tighten-up/correct my probabilistic/mathematical language in their explanation of this.
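For concreteness, here is a small sketch of the setup I have in mind (the choice of uniform points on $[0,1]$ and the particular intervals are arbitrary, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Binomial point process: n i.i.d. uniform points on [0, 1]
n = 1000
points = rng.uniform(0, 1, size=n)

# Disjoint (Borel) sets: here, the intervals [0, 0.2), [0.3, 0.5), [0.6, 1.0)
sets = [(0.0, 0.2), (0.3, 0.5), (0.6, 1.0)]

# Y_i counts the points falling in B_i; each Y_i is Binomial(n, p_i),
# with p_i the length of the interval B_i
Y = [int(((points >= a) & (points < b)).sum()) for a, b in sets]

# The counts are dependent: jointly they satisfy Y_1 + Y_2 + Y_3 <= n
print(Y, sum(Y) <= n)
```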
For a multinomial distribution with $n$ trials and $k$ possible outcomes per trial, we may sample from the distribution as follows.
We let $X_1,\dots,X_n$ represent the outcomes of each of the $n$ trials. So each $X_i$ takes exactly one of the values $1,\dots,k$, and $X_i = j$ with probability $p_j$. Importantly, each trial is independent of all the others, in the sense that the random variables $X_1,\dots,X_n$ are independent.
Now we realise the multinomial distribution as the random vector $N=(N_1,\dots, N_k)$ by letting each $N_j$ count how many of the $X_i$ have value equal to $j$, i.e. $$N_j = \sum_{i=1}^n 1_{\{X_i=j\}}.$$
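This construction translates directly into code. A minimal sketch (the parameter values are arbitrary): draw the $n$ independent trials, then build each $N_j$ by counting.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 10, 3
p = np.array([0.2, 0.3, 0.5])  # outcome probabilities p_1, ..., p_k

# The n independent trials X_1, ..., X_n, each taking a value in {1, ..., k}
X = rng.choice(np.arange(1, k + 1), size=n, p=p)

# Realise the multinomial vector N = (N_1, ..., N_k) by counting:
# N_j = sum_i 1{X_i = j}
N = np.array([(X == j).sum() for j in range(1, k + 1)])

print(N, N.sum())  # the counts always sum to n
```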
Now, even though the $X_i$ (the trials) are independent of each other, note that the $N_j$ are not! Indeed, we have the constraint $N_1 + \dots + N_k = n$, even though each $N_j$ has a binomial marginal distribution.
So in short, the independence requirement applies to the trials from which we sample the distribution, not to the counts that make up the realisation. You can double-check that this agrees with your intuition in the binomial ($k=2$) case, where we have a sequence of independent coin flips and we count the number of heads and tails.
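The dependence of the counts is easy to verify numerically. For a multinomial vector, $\operatorname{Cov}(N_i, N_j) = -n p_i p_j$ for $i \ne j$, so the pairwise covariances are negative. A quick empirical check using NumPy's built-in multinomial sampler (parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

n, p = 20, [0.2, 0.3, 0.5]
samples = rng.multinomial(n, p, size=100_000)

# Within each draw, the counts always sum to n ...
assert (samples.sum(axis=1) == n).all()

# ... so the N_j are negatively correlated: Cov(N_1, N_2) = -n p_1 p_2
emp_cov = np.cov(samples[:, 0], samples[:, 1])[0, 1]
print(emp_cov, -n * p[0] * p[1])  # empirical vs. theoretical covariance
```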