I am trying to derive the maximum likelihood estimator (MLE) for the Bernoulli distribution. Starting from the likelihood
\begin{align} P(D\mid \theta) &= \binom {n_H + n_T}{n_H} \theta^{n_H} (1 - \theta)^{n_T}, \end{align}
I plug in this definition and take the log-likelihood:
\begin{align} \hat{\theta}_{MLE} &= \operatorname*{argmax}_{\theta} \,P(D\mid \theta) \\ &= \operatorname*{argmax}_{\theta} \binom {n_H + n_T}{n_H} \theta^{n_H} (1 - \theta)^{n_T} \\ &= \operatorname*{argmax}_{\theta} \,\log\binom {n_H + n_T}{n_H} + n_H \cdot \log(\theta) + n_T \cdot \log(1 - \theta) \end{align}
How do I compute $\log\binom {n_H + n_T}{n_H}$?
You don't need to evaluate $\log \binom{n_H+n_T}{n_H}$ at all. To find the argmax you differentiate with respect to $\theta$, and since $\log \binom{n_H+n_T}{n_H}$ does not depend on $\theta$, it vanishes after differentiation.
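For completeness, here is the rest of the computation. Setting the derivative of the log-likelihood to zero (the constant binomial term drops out):
\begin{align}
\frac{d}{d\theta}\left[\log\binom{n_H+n_T}{n_H} + n_H \log\theta + n_T \log(1-\theta)\right]
&= \frac{n_H}{\theta} - \frac{n_T}{1-\theta} = 0 \\
\Longrightarrow \quad n_H(1-\theta) &= n_T\,\theta \\
\Longrightarrow \quad \hat{\theta}_{MLE} &= \frac{n_H}{n_H + n_T},
\end{align}
i.e. the empirical fraction of heads, which is the standard result for the Bernoulli MLE.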