Beta Distribution Sufficient Statistic


I have a homework problem that I am struggling to reach a solid answer on. The problem goes like this:

Suppose $X \sim \mathrm{Beta}(\theta,\theta)$ with $\theta>0$, and let $\{X_1, X_2 , \ldots , X_n \}$ be a sample. Show that $T=\prod_i X_i(1-X_i)$ is a sufficient statistic for $\theta$.

I started out with my Beta distribution as:

$f(x;\alpha,\beta)=\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}$

so with $\alpha=\beta=\theta$ the joint density of the sample is

$f(x_1,\ldots,x_n;\theta)=\frac{\Gamma(\theta + \theta)}{\Gamma(\theta)\Gamma(\theta)}x_1^{\theta-1}(1-x_1)^{\theta-1} \cdots \frac{\Gamma(\theta + \theta)}{\Gamma(\theta)\Gamma(\theta)}x_n^{\theta-1}(1-x_n)^{\theta-1}$

$=\frac{\Gamma(2\theta)}{\Gamma(\theta)^2}x_1^{\theta-1}(1-x_1)^{\theta-1} \cdots \frac{\Gamma(2\theta)}{\Gamma(\theta)^2}x_n^{\theta-1}(1-x_n)^{\theta-1}$

$=\left(\frac{\Gamma(2\theta)}{\Gamma(\theta)^2}\right)^n \prod_i \left[x_i(1-x_i)\right]^{\theta-1}$

I know that for my statistic to be sufficient by the factorization theorem, I need to write the joint density as the product of a $g(T,\theta)$ and an $h(x_1,x_2,\ldots,x_n)$ that does not depend on $\theta$.

What I have above is my $g(T,\theta)$, but I am not so sure about my $h(x_1,x_2,...,x_n)$. I have seen other places where the suggestion is to use 1 for my $h(x_1,x_2,...,x_n)$. Could I do this here with this problem? It just seems a little too easy to do that, but I will be happy if it is that easy.

If anyone could let me know, that would be greatly appreciated.
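
A sketch of how the factorization could be written out, assuming the support restriction $0<x_i<1$ is expressed as an indicator (the indicator notation $\mathbf{1}\{\cdot\}$ is an addition for illustration and does not appear in the problem statement):

$$f(x_1,\ldots,x_n;\theta)=\underbrace{\left(\frac{\Gamma(2\theta)}{\Gamma(\theta)^2}\right)^n T^{\theta-1}}_{g(T,\theta)}\cdot\underbrace{\prod_{i=1}^n\mathbf{1}\{0<x_i<1\}}_{h(x_1,\ldots,x_n)},\qquad T=\prod_{i=1}^n x_i(1-x_i)$$

On the support every indicator equals 1, so taking $h\equiv 1$ there is consistent with the factorization theorem: all dependence on $\theta$ enters through $T$ alone.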


(The above answer did not address minimal sufficiency, so I show it here.)

Note that the beta distribution with p.d.f. $$\delta_{\alpha, \beta}(x) = cx^{\alpha-1}(1-x)^{\beta-1}dx$$ belongs to an exponential family, because it can be rewritten as

$$\delta_{\alpha, \beta}(x) = c\exp{\left[\alpha\log(x)+\beta\log(1-x)\right]}\mu(dx)$$

where $c$ is the normalizing constant, which depends on $(\alpha,\beta)$, and $\mu(dx)=\frac{dx}{x(1-x)}$.

We know, for example from equation 2.5 of this reference, that

a minimal sufficient statistic for a distribution that belongs to a full-rank exponential family is $T(X)$, where the p.d.f. of the exponential family is $\exp{[T(X)'\eta(\theta)]}\,c(\eta(\theta))\,\mu(dx)$.

Thus we conclude that $$\left(\sum_{i=1}^n\log(x_i),\sum_{i=1}^n\log(1-x_i)\right)$$ is a minimal sufficient statistic for $(\alpha, \beta)$.

When $\alpha=\beta$, the distribution is still from a full-rank exponential family, because the p.d.f. can be written as $$\delta_{\alpha}(x) = c\exp{\left\{\alpha[\log(x)+\log(1-x)]\right\}}\mu(dx)$$ This means $$\sum_{i=1}^n\left[\log (x_i) + \log(1-x_i)\right]$$ is a minimal sufficient statistic when $\alpha=\beta$. This sum equals $\log\prod_{i=1}^n x_i(1-x_i)$, the logarithm of the statistic $T$ from the question.
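
As a quick numerical sanity check (not a proof), the sketch below evaluates the $\mathrm{Beta}(\theta,\theta)$ log-likelihood in two ways: directly from the sample, and from $\log T$ and $n$ alone. The function names, the seed, and the simulated sample are all assumptions made for illustration; only the Python standard library is used.

```python
import math
import random


def log_statistic(xs):
    """log T = sum_i [log(x_i) + log(1 - x_i)] for a sample in (0, 1)."""
    return sum(math.log(x) + math.log(1 - x) for x in xs)


def loglik_direct(xs, theta):
    """Log-likelihood of Beta(theta, theta), summed point by point."""
    log_c = math.lgamma(2 * theta) - 2 * math.lgamma(theta)
    return sum(
        log_c + (theta - 1) * (math.log(x) + math.log(1 - x)) for x in xs
    )


def loglik_from_statistic(log_t, n, theta):
    """The same log-likelihood, computed from log T and n only."""
    log_c = math.lgamma(2 * theta) - 2 * math.lgamma(theta)
    return n * log_c + (theta - 1) * log_t


random.seed(0)  # arbitrary seed, chosen only for reproducibility
xs = [random.betavariate(2.0, 2.0) for _ in range(20)]
# Reflecting every point (x -> 1 - x) gives a different sample with the same T.
ys = [1 - x for x in xs]

for theta in (0.5, 1.0, 2.0, 5.0):
    direct = loglik_direct(xs, theta)
    via_t = loglik_from_statistic(log_statistic(xs), len(xs), theta)
    reflected = loglik_direct(ys, theta)
    print(theta, direct, via_t, reflected)
```

For each $\theta$ the three printed values should agree up to floating-point rounding, which is what one would expect if the sample enters the likelihood only through $T$.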