Let's say we have a discrete distribution with the following probabilities:
$P(X=0)=\frac{1}{3}\theta, P(X=1)=\frac{2}{3}\theta, P(X=2)=\frac{2}{3}(1-\theta), P(X=3)=\frac{1}{3}(1-\theta)$
Estimating $\theta$ by maximum likelihood is quite easy: for the sample $(3,0,2,1,3,2,1,0,2,1)$ it yields $\hat{\theta}_{MLE}=0.5$.
But how can we calculate (or estimate) the variance of $\hat{\theta}_{MLE}$? I don't see any straightforward way to do that using the Fisher information or just the definition of variance.
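For what it's worth, the MLE for this particular sample can be checked numerically, e.g. with a simple grid search over $\theta \in (0,1)$ (a quick sanity-check sketch, not a serious optimizer):

```python
import math

sample = [3, 0, 2, 1, 3, 2, 1, 0, 2, 1]

def log_likelihood(theta, sample):
    # P(X=0), P(X=1), P(X=2), P(X=3) as functions of theta
    p = [theta / 3, 2 * theta / 3, 2 * (1 - theta) / 3, (1 - theta) / 3]
    return sum(math.log(p[x]) for x in sample)

# grid search over the open interval (0, 1)
grid = [k / 1000 for k in range(1, 1000)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, sample))
print(theta_hat)  # 0.5
```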
Assume one observes $n_i$ times the result $i$, in a sample of size $n=n_0+n_1+n_2+n_3$. Then the likelihood of the sample is
$$ \ell(\theta)=\theta^{n_0}(2\theta)^{n_1}(2(1-\theta))^{n_2}(1-\theta)^{n_3}\,3^{-n}, $$
hence
$$ \frac{\partial\log\ell(\theta)}{\partial\theta}=\frac{n_0+n_1}{\theta}-\frac{n_2+n_3}{1-\theta}, $$
which implies that the maximum likelihood estimator $\widehat\theta$ of $\theta$ based on this sample is
$$ \widehat\theta=\frac{n_0+n_1}{n}. $$
Note that $n_0+n_1$ is the number of successes in $n$ i.i.d. trials, where the probability of success on each trial is $P[X=0]+P[X=1]=\theta$. Thus $n\widehat\theta$ is binomial $(n,\theta)$, hence
$$ n^2\operatorname{var}(\widehat\theta)=n\theta(1-\theta), $$
that is,
$$ \operatorname{var}(\widehat\theta)=\frac{\theta(1-\theta)}{n}. $$
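Since $\widehat\theta$ depends on the sample only through $n_0+n_1$, which is binomial $(n,\theta)$, the variance formula can be confirmed by a quick Monte Carlo sketch (simulating only the success indicator $X\in\{0,1\}$; the values of `theta`, `n`, and the replication count are arbitrary choices for illustration):

```python
import random

def sample_mle(theta, n, rng):
    # P(X in {0,1}) = theta, so n0 + n1 ~ Binomial(n, theta)
    successes = sum(rng.random() < theta for _ in range(n))
    return successes / n

rng = random.Random(42)
theta, n, reps = 0.5, 10, 100_000

estimates = [sample_mle(theta, n, rng) for _ in range(reps)]
mean = sum(estimates) / reps
var = sum((e - mean) ** 2 for e in estimates) / reps

print(var)                      # empirical variance of the MLE
print(theta * (1 - theta) / n)  # theoretical value: 0.025
```

With $\theta=0.5$ and $n=10$ the empirical variance should be close to $0.5\cdot 0.5/10 = 0.025$.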