Trouble finding an estimator from a discrete RV


Okay, so I am trying to find unbiased and consistent estimators of a parameter $a$ from a sequence of RVs representing rolls of an unfair die: it rolls 1 with probability $1+a$, 6 with probability $1-a$, and each of the others with probability $\frac{1}{6}$.

Here's where I'm confused: I know that one way is to find a Maximum Likelihood estimator, which requires me to find the probability mass function of each RV, then multiply them, take the log, yada yada yada, apply calculus, and we get it. Except, how do I put this RV into one nice equation that's easy to multiply? All the examples I find online conveniently use the Bernoulli distribution, which has a fine and dandy PMF. How do I even tackle something like this?




I assume you meant $(1 + \alpha)/6$ and $(1 - \alpha)/6$, where $0 < \alpha < 1,$ for the respective probabilities of faces 1 and 6, and that we are rolling the die $n$ times.

Intuitively, and immediately obvious from the likelihood, one should ignore counts for outcomes other than 1 and 6 as irrelevant. Let $X_1$ and $X_6$ be the respective counts of 1 and 6 in $n$ rolls.

Upon finding the derivative of the log-likelihood function, etc., it seems the estimator is $(X_1 - X_6)/(X_1 + X_6).$

I'll leave the details of that, and the discussion of unbiasedness and consistency, to you.

I tried simulations with 10 million rolls, for $\alpha = .1$ and $.3$, and got three-place accuracy.
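For anyone wanting to reproduce a check along those lines, here is a minimal sketch in Python (the function name, seed, and sample size are my own choices; the answerer didn't share their code):

```python
import random

def simulate_alpha_hat(alpha, n_rolls, seed=0):
    """Roll the biased die n_rolls times and return the estimate
    (X1 - X6) / (X1 + X6) discussed above.

    Face probabilities: 1 -> (1+alpha)/6, 6 -> (1-alpha)/6, others -> 1/6.
    """
    rng = random.Random(seed)
    probs = [(1 + alpha) / 6, 1 / 6, 1 / 6, 1 / 6, 1 / 6, (1 - alpha) / 6]
    rolls = rng.choices([1, 2, 3, 4, 5, 6], weights=probs, k=n_rolls)
    x1 = rolls.count(1)  # count of face 1
    x6 = rolls.count(6)  # count of face 6
    return (x1 - x6) / (x1 + x6)

print(simulate_alpha_hat(0.3, 200_000))  # should land near 0.3
```

With 200,000 rolls you get roughly two-place accuracy; pushing `n_rolls` toward 10 million (faster with NumPy) reproduces the three-place accuracy mentioned above.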


The formal probability distribution for the outcome of a single roll is $$\Pr[X = k] = \begin{cases} \frac{1+a}{6}, & k = 1 \\ \frac{1}{6}, & k \in \{2, 3, 4, 5 \} \\ \frac{1-a}{6}, & k = 6, \end{cases}$$ for $-1 \le a \le 1.$ It is not necessary to write this as a single function. The joint distribution of a sample $\boldsymbol x = (x_1, \ldots, x_n)$, given the parameter $a$, is simply $$f(\boldsymbol x \mid a) = \frac{(1+a)^{y_1} (1-a)^{y_2}}{6^n},$$ where $$y_1 = \sum_{i=1}^n \mathbb{1}(x_i = 1)$$ is the number of $x_i$s that equal $1$, and similarly, $$y_2 = \sum_{i=1}^n \mathbb{1}(x_i = 6)$$ is the number of $x_i$s that equal $6$.

This suggests that we should perform the computation on the sufficient statistic $T(\boldsymbol x) = (y_1, y_2)$ for $a$, rather than on the sample $\boldsymbol x$ itself. The log-likelihood is proportional to $$\ell(a \mid y_1, y_2) = y_1 \log (1+a) + y_2 \log (1-a),$$ and since $|a| \le 1$, we find the critical point(s) satisfying $$0 = \frac{\partial \ell}{\partial a} = \frac{y_1}{1+a} - \frac{y_2}{1-a},$$ or $$\hat a = \frac{y_1 - y_2}{y_1 + y_2}.$$

Note that we need to be a bit careful about the log-likelihood when $|a| = 1$, but the estimator works out okay in the end, since in such a case you would never simultaneously observe $y_1 > 0$ and $y_2 > 0$. It is not difficult to show that this critical point yields a maximum of the likelihood $L(\hat a \mid y_1, y_2)$.
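As a numerical sanity check on the closed form, the sketch below computes $\hat a$ from counts and confirms by grid search that it maximizes the log-likelihood (the counts $y_1 = 220$, $y_2 = 180$ are hypothetical, just for illustration):

```python
import math

def log_likelihood(a, y1, y2):
    # l(a | y1, y2) = y1*log(1+a) + y2*log(1-a), valid for -1 < a < 1
    return y1 * math.log(1 + a) + y2 * math.log(1 - a)

def mle(y1, y2):
    # Closed-form critical point from the derivation above
    return (y1 - y2) / (y1 + y2)

y1, y2 = 220, 180      # hypothetical counts of faces 1 and 6
a_hat = mle(y1, y2)    # (220 - 180) / (220 + 180) = 0.1

# Grid search over (-1, 1) should agree with the closed form
grid = [i / 1000 for i in range(-999, 1000)]
a_grid = max(grid, key=lambda a: log_likelihood(a, y1, y2))
print(a_hat, a_grid)
```

The grid maximizer matches the closed-form $\hat a$ to the grid's resolution, which is a quick way to catch sign errors in derivations like this.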