Here is problem 6 from chapter 2 of Introduction to Probability by Bertsekas and Tsitsiklis:
The Celtics and the Lakers are set to play a playoff series of $n$ basketball games, where $n$ is odd. The Celtics have a probability $p$ of winning any one game, independent of other games. For any positive integer $k$, find the values for $p$ for which $n = 2k + 1$ is better for the Celtics than $n = 2k-1$.
When I read this problem statement, I quickly felt certain based on intuition that a longer series is better when $p > 1/2$. Question: Is there a short proof that allows us to see this result at a glance?
Here is a solution to the problem which seems overly complicated, given how obvious the result is intuitively. The calculation below is surely not what my brain did in order to be certain that the answer must be $p > 1/2$.
Imagine that the two teams play $2k + 1$ games, and let the random variable $N$ be the number of games won by the Celtics during the first $2k -1$ games. The probability $p_{2k+1}$ of the Celtics winning the "best of $2k+1$" series (which requires winning at least $k + 1$ games in the series) is

$$ \tag{1} p_{2k+1} = P(N \geq k+1) + P(N = k)(1 - (1-p)^2) + P(N = k-1)p^2. $$

On the other hand, the probability $p_{2k-1}$ of the Celtics winning a "best of $2k - 1$" series is

$$ \tag{2} p_{2k-1} = P(N \geq k + 1) + P(N=k). $$

Notice that $P(N=k) = \binom{2k-1}{k}p^k(1-p)^{k-1}$ and

$$ P(N = k-1) = \binom{2k-1}{k-1}p^{k-1}(1-p)^k = \binom{2k-1}{k}p^{k-1}(1-p)^k. $$

Comparing $(1)$ and $(2)$, we see that

\begin{align}
p_{2k+1} > p_{2k-1} &\iff P(N=k-1)p^2 > P(N=k)(1-p)^2 \\
&\iff p^{k+1}(1-p)^k > p^k(1-p)^{k+1} \\
&\iff p > \frac12.
\end{align}
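As a sanity check on the comparison above, here is a minimal Python sketch that computes the series-winning probabilities exactly, using rational arithmetic. The function names `series_win_prob` and `gain_from_two_more_games` are my own, chosen for illustration. The closed form $p_{2k+1} - p_{2k-1} = \binom{2k-1}{k}p^k(1-p)^k(2p-1)$ is what you get by subtracting $(2)$ from $(1)$ and substituting the two binomial expressions.

```python
from fractions import Fraction
from math import comb


def series_win_prob(n, p):
    """Probability of winning a majority of n games (n odd),
    each game won independently with probability p."""
    need = (n + 1) // 2  # games required to win the series
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(need, n + 1))


def gain_from_two_more_games(k, p):
    """Closed form for p_{2k+1} - p_{2k-1}, obtained from (1) minus (2)."""
    return comb(2 * k - 1, k) * p**k * (1 - p) ** k * (2 * p - 1)


# Compare the direct computation with the closed form for a few values.
for p in (Fraction(2, 5), Fraction(1, 2), Fraction(3, 5)):
    for k in (1, 2, 5):
        diff = series_win_prob(2 * k + 1, p) - series_win_prob(2 * k - 1, p)
        assert diff == gain_from_two_more_games(k, p)
        print(f"p={p}, k={k}: p_(2k+1) - p_(2k-1) = {float(diff):+.6f}")
```

The sign of the printed difference follows the sign of $2p - 1$: negative for $p = 2/5$, zero for $p = 1/2$, positive for $p = 3/5$.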
If there is no simpler proof, then why are we so certain at the outset of what the answer must be?
It's the "central limit theorem". If a single trial has any probability distribution with finite mean $\mu$ and finite standard deviation $\sigma$, then the average over $n$ independent trials has approximately a normal distribution with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$. The larger $n$ is, the smaller that standard deviation is, so the less the average is likely to vary from the mean. Here the average is the fraction of games the Celtics win, and its mean is $p$; when $p > \frac12$, a longer series makes that fraction more likely to stay above $\frac12$, which is what winning the series requires.
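To see the heuristic numerically, here is a rough Python sketch comparing the normal approximation with the exact binomial probability of winning a majority of games. The function names and the choice $p = 0.6$ are assumptions for illustration only. The approximation uses $\sigma = \sqrt{p(1-p)}$ for a single game and no continuity correction, so it should only be expected to track the exact value loosely for small $n$.

```python
from math import comb, erf, sqrt


def std_of_average(p, n):
    """Standard deviation of the fraction of games won over n independent games."""
    return sqrt(p * (1 - p) / n)


def normal_approx_majority(p, n):
    """Normal approximation to P(fraction of wins > 1/2), no continuity correction."""
    z = (p - 0.5) / std_of_average(p, n)
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF evaluated at z


def exact_majority(p, n):
    """Exact P(more than n/2 wins) for odd n, from the binomial distribution."""
    need = (n + 1) // 2
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(need, n + 1))


p = 0.6  # illustrative value with p > 1/2
for n in (1, 11, 101):
    print(
        f"n={n:3d}  sd={std_of_average(p, n):.4f}  "
        f"normal={normal_approx_majority(p, n):.4f}  exact={exact_majority(p, n):.4f}"
    )
```

As $n$ grows, the standard deviation of the win fraction shrinks and both probabilities move toward 1 when $p > \frac12$.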