Given a random sample $X_1, X_2, \dots, X_n$ from a Bernoulli distribution with parameter $\theta$, the log-likelihood function is:
$\mathcal{L}(\theta) = \left(\sum_{i=1}^n x_i\right)\log{\theta} + \left(n - \sum_{i=1}^n x_i\right)\log{(1-\theta)}$
Score function:
$\frac{\partial \mathcal{L}}{\partial \theta} = \frac{\sum_{i=1}^n x_i}{\theta} - \frac{n - \sum_{i=1}^n x_i}{1 - \theta} \; \forall \theta \in (0,1)$
By solving the score equation $\frac{\partial \mathcal{L}}{\partial \theta} = 0$ we get $\theta = \bar{x}$ as a candidate maximum likelihood estimate for $\theta$.
We need to verify that the second derivative is negative at $\theta = \bar{x}$.
$\frac{\partial^2 \mathcal{L}}{\partial \theta^2} = -\frac{\sum_{i=1}^n x_i}{\theta^2} - \frac{n - \sum_{i=1}^n x_i}{(1 - \theta)^2} \; \forall \theta \in (0,1)$
$\frac{\partial^2 \mathcal{L}}{\partial \theta^2}\Big\rvert_{\theta = \bar{x}} = -\frac{\sum_{i=1}^n x_i}{\bar{x}^2} - \frac{n - \sum_{i=1}^n x_i}{(1 - \bar{x})^2}$
At this step, some people will conclude immediately that the second derivative evaluated at the maximum likelihood estimate is negative, so $\bar{X}$ is the maximum likelihood estimator for $\theta$.
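As a quick numerical sanity check of the derivation (a sketch only, using a made-up sample with $0 < \bar{x} < 1$), the score and second derivative above can be evaluated directly at $\theta = \bar{x}$:

```python
import numpy as np

# Hypothetical Bernoulli sample with 0 < xbar < 1 (illustrative only)
x = np.array([1, 0, 1, 1, 0, 1, 0, 1])
n, s = len(x), x.sum()
theta_hat = x.mean()  # xbar

def score(theta):
    # dL/dtheta = sum(x_i)/theta - (n - sum(x_i))/(1 - theta)
    return s / theta - (n - s) / (1 - theta)

def second_deriv(theta):
    # d^2 L/dtheta^2 = -sum(x_i)/theta^2 - (n - sum(x_i))/(1 - theta)^2
    return -s / theta**2 - (n - s) / (1 - theta)**2

print(score(theta_hat))        # ~0: xbar solves the score equation
print(second_deriv(theta_hat)) # negative: xbar is a local maximum
```

Note that the check only makes sense when $\bar{x} \in (0,1)$; both functions divide by $\theta$ or $1-\theta$, which is exactly the boundary issue the question is about.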
However, do I have to say that the second derivative is negative under the condition that $\bar{x} \neq 0$ and $ 1 - \bar{x} \neq 0$? Or is this constraint implied from the beginning? I just want everything to be completely precise.
I know that if $\bar{x} = 1$ or $\bar{x} = 0$, the likelihood function behaves differently: it is monotone on $(0,1)$ and has no interior critical point, but the maximum likelihood estimate in these situations is still $\bar{x}$. Of course, we cannot use the method shown above to derive it.
I disagree that they would not exist. The second derivative is only a tool for finding the global maximum of the likelihood function. We have the log-likelihood function
$$L(\theta)=\left(\sum x_i\right) \log \theta+\left(n-\sum x_i\right)\log(1-\theta),\qquad\theta\in[0,1]$$
If all the observations are successes, we get the likelihood
$$L(\theta)= n\log \theta$$
Since the function is increasing in $\theta$, we get $\hat\theta=1$. Similarly if all the observations are failures, we get the likelihood
$$L(\theta)=n\log(1-\theta)$$
this is now decreasing in $\theta$, so we want the smallest possible $\theta$, which in the interval $[0,1]$ is $\hat\theta=0$. The moral of the story is that the second derivative can be used for the interior values but not the end points.
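The monotonicity in the two boundary cases is easy to see numerically. A minimal sketch (sample size $n = 8$ is an arbitrary choice) evaluating the two edge-case log-likelihoods on a grid:

```python
import math

n = 8
thetas = [0.1 * k for k in range(1, 10)]  # grid over the interior (0, 1)

# All successes: L(theta) = n * log(theta), strictly increasing on (0, 1),
# so the maximum over [0, 1] is attained at theta = 1.
L_all_ones = [n * math.log(t) for t in thetas]
assert all(a < b for a, b in zip(L_all_ones, L_all_ones[1:]))

# All failures: L(theta) = n * log(1 - theta), strictly decreasing on (0, 1),
# so the maximum over [0, 1] is attained at theta = 0.
L_all_zeros = [n * math.log(1 - t) for t in thetas]
assert all(a > b for a, b in zip(L_all_zeros, L_all_zeros[1:]))
```

In both cases the maximizer sits on the boundary of $[0,1]$, where the score equation and second-derivative test do not apply, yet the MLE is still $\hat\theta = \bar{x}$.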