Let's say a coin produces $H$ (for heads) with probability $p$ (and thus it produces $T$ with probability $1-p$). Let $X$ denote the number of $H$s we see and let $h$ be an integer. Then, if we flip the coin $n$ times:
$P(X=h) = \binom{n}{h}p^h(1-p)^{n-h}$
I'm not quite understanding where this formula is coming from.
I can see that there are $\binom{n}{h}$ ways to choose a $h$-length subset of an $n$-length set (i.e. there are $\binom{n}{h}$ $n$-length sequences where we see $h$ heads), but why do we multiply that with the probability? Furthermore, why do we multiply the probabilities of heads and tails together?
I get that coin flipping is an independent repeated trial, but I'm not sure how to apply that here (or whether it's even relevant). Any help is appreciated!
Say, rather, $\tbinom n h$ is the count of ways to select $h$ elements from a set of $n$ distinct elements. For example, the ways to select $h$ positions from a sequence of $n$ trials.
Thus $\tbinom nh$ is the count of ways to arrange $h$ heads among $n$ trial results.
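As a small illustration (the values $n=3$ and $h=2$ are picked here only as an example), choosing which two of the three positions hold heads gives:
$$\binom 3 2 = \frac{3!}{2!\,1!} = 3:\qquad \{1,2\}\mapsto HHT,\quad \{1,3\}\mapsto HTH,\quad \{2,3\}\mapsto THH$$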
$p$ is the probability that a trial will result in heads. Because the trials are independent, the probability that several of them all turn out a given way is the product of the individual probabilities. So $p^h$ is the probability that $h$ specified trials all result in heads, and $(1-p)^{n-h}$ is the probability that $n-h$ specified trials all result in tails. $p^h(1-p)^{n-h}$ is therefore the probability that the first $h$ trials result in heads and the next $n-h$ trials result in tails.
However, that is for one particular arrangement of $h$ heads and $n-h$ tails. We want the probability for any arrangement of that many heads and tails among $n$ trials. Every such arrangement has the same probability $p^h(1-p)^{n-h}$, and distinct arrangements are mutually exclusive, so their probabilities add. Adding the same value $\tbinom nh$ times is the same as multiplying by the count of arrangements:
$$\mathsf P(X{=}h) ~=~\dbinom n h p^h(1-p)^{n-h}\qquad\big[h\in\Bbb N, 0\leq h\leq n\big]$$
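If it helps to see the argument numerically, here is a minimal Python sketch that checks the formula against brute-force enumeration. The function names `binomial_pmf` and `brute_force_pmf` and the sample values of $n$, $h$ and $p$ are made up for this illustration.

```python
from itertools import product
from math import comb, isclose


def binomial_pmf(n, h, p):
    """Closed form: C(n, h) * p^h * (1 - p)^(n - h)."""
    return comb(n, h) * p**h * (1 - p) ** (n - h)


def brute_force_pmf(n, h, p):
    """Sum the probabilities of every length-n sequence with exactly h heads."""
    total = 0.0
    for seq in product("HT", repeat=n):
        if seq.count("H") != h:
            continue
        # Independence: a sequence's probability is the product over its flips.
        prob = 1.0
        for flip in seq:
            prob *= p if flip == "H" else 1 - p
        # Distinct sequences are mutually exclusive, so their probabilities add.
        total += prob
    return total


if __name__ == "__main__":
    n, h, p = 5, 2, 0.3  # arbitrary illustrative values
    print(binomial_pmf(n, h, p), brute_force_pmf(n, h, p))
```

The two printed numbers should agree up to floating-point rounding. Every sequence with $h$ heads contributes the same product, which is why the sum collapses to the count times $p^h(1-p)^{n-h}$.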