I have the following problem (I want to predict sports results). Let $X$ be a discrete random variable with $m$ possible outcomes, where outcome $i$ has probability $p_i$, for $i=1,\ldots,m$. Assume that I have a large i.i.d. sample $X_1,\ldots,X_n$, where each variable has the same distribution as $X$. Now, to make a prediction for each $X_k$ (before its actual value is drawn) I could just always predict the outcome $i^*$ with the largest probability $p^*=\max_i p_i$. Then, in expectation, I would be right on $n\,p^*$ of the samples.
Alternatively, I could sample from the distribution of $X$: for each $X_k$, predict $\hat{X}_k=i$ with probability $p_i$. Then the probability that my prediction for $X_k$ equals the actual draw of $X_k$ is
$$q=P[\hat{X}_k=X_k]=\sum_{i=1}^m P[\hat{X}_k=i]P[X_k=i]=\sum_{i=1}^m p_i^2.$$
Thus, under this approach, the number of correct predictions is binomial, and the probability of being correct on exactly $r$ predictions is $B(r;n,q)=\binom{n}{r}q^r(1-q)^{n-r}$. Now, what is the probability that this approach gives me at least $n\,p^*$ correct results (where $p^*=\max_i p_i$), i.e. $\sum_{r=\lceil n p^*\rceil}^{n} B(r;n,q)$? Are there any known general results on this? How does the answer depend on the distribution of $X$ (e.g. uniform, skewed, ...)?
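To make the tail probability concrete, here is a small sketch (the three-outcome distribution and $n=100$ are hypothetical choices of mine) that evaluates $\sum_{r\ge\lceil n p^*\rceil} B(r;n,q)$ directly from the binomial pmf:

```python
import math

def binom_tail(n, q, r_min):
    """P[Bin(n, q) >= r_min], summed directly from the pmf."""
    return sum(math.comb(n, r) * q**r * (1 - q)**(n - r)
               for r in range(r_min, n + 1))

# Hypothetical skewed distribution over m = 3 outcomes.
p = [0.5, 0.3, 0.2]
q = sum(pi**2 for pi in p)   # probability a sampled guess matches, here 0.38
p_star = max(p)              # 0.5
n = 100

# Probability that sampling reaches the majority benchmark n * p_star.
tail = binom_tail(n, q, math.ceil(n * p_star))
```

For this skewed example the tail comes out small (a fraction of a percent); for the uniform distribution on $m$ outcomes, $q = p^* = 1/m$, so the benchmark $n p^*$ sits at the mean of the binomial and the tail is close to $1/2$.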
In other words, should I sample (approach 2) or just predict the most probable outcome (approach 1)? (Did I make any mistakes somewhere?)
Note, writing $p_{\max}=\max_i p_i$, that sampling can never beat the majority prediction in expectation, since
$$q=\sum_{i=1}^m p_i^2 \le \sum_{i=1}^m p_i\,p_{\max} = p_{\max}\sum_{i=1}^m p_i = p_{\max},$$
with equality if and only if the $p_i$ are uniform on the support of $X$.
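As a sanity check on the comparison, here is a quick Monte Carlo sketch (the distributions, $n$, and trial count are my own illustrative choices) estimating how often approach 2 reaches the approach-1 benchmark of $n\,p_{\max}$ correct predictions:

```python
import random

def simulate(p, n, trials=2000, seed=0):
    """Estimate P[sampling strategy gets >= n * p_max predictions right].

    Each trial draws n outcomes from p and n independent guesses from p,
    then counts matches against the benchmark n * max(p)."""
    rng = random.Random(seed)
    outcomes = range(len(p))
    threshold = n * max(p)
    wins = 0
    for _ in range(trials):
        draws = rng.choices(outcomes, weights=p, k=n)
        guesses = rng.choices(outcomes, weights=p, k=n)
        correct = sum(d == g for d, g in zip(draws, guesses))
        if correct >= threshold:
            wins += 1
    return wins / trials

# Skewed example: sampling rarely reaches the majority benchmark.
skewed = simulate([0.5, 0.3, 0.2], n=100)
# Uniform example: q = p_max = 1/m, so the tail probability is moderate.
uniform = simulate([1/3, 1/3, 1/3], n=100)
```

The skewed case illustrates the inequality above in action: the more concentrated the distribution, the further $nq$ falls below $n\,p_{\max}$, and the rarer it is for sampling to catch up.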