Checking whether the given statistic is sufficient


A random sample is drawn from a Bernoulli distribution, where $X_i = 1$ with unknown probability $p$ and $X_i = 0$ otherwise. Examine whether the following statistics are sufficient for the parameter $p$:

  1. $(X_1,\ldots,X_n)$
  2. $(X_1^2,[X_2+...+X_n]^2)$

My doubt: how are the above two statistics? I am under the impression that a statistic is a function of the sample, i.e. a statistic $T=f(X_1,\ldots,X_n)$, for example $Y=(X_1+\cdots+X_n)/n$. I can easily check whether $Y$ is sufficient for $p$ or not.

So how are the above two statistics? And how does one check their sufficiency?

Answer 1:

A statistic is a function of the sample that does not depend on any unknown parameters of the distribution(s) from which the sample was drawn. This definition does not require such a function to be scalar-valued. It may be (and often is) a vector-valued function.

Therefore, the original sample $\boldsymbol X = (X_1, \ldots, X_n)$ is itself a statistic: it is the vector-valued identity function of the sample, $\boldsymbol T = \boldsymbol f(\boldsymbol X) = \boldsymbol X$. As such, it is tautologically a sufficient statistic, because it contains all of the information about the parameter(s) that is present in the original sample.

The less trivial question is the second one: is the vector-valued statistic $\boldsymbol T = (X_1^2, (X_2 + \cdots + X_n)^2)$ a sufficient statistic? That is to say, does this ordered pair retain as much information about the parameter $p$ as the original sample does? Answering this requires some actual mathematics; e.g., the Factorization Theorem.

But here's a hint: first show that $\bar X = (X_1 + \cdots + X_n)/n$ is a sufficient statistic for $p$. Then show that $\bar X$ can be expressed as a function of $\boldsymbol T$. It follows that $\boldsymbol T$ is also sufficient for $p$, since given $\boldsymbol T$, you can compute $\bar X$.
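A quick numerical sanity check of this hint (in Python, which is my choice here, not the original poster's): since each $X_i \in \{0,1\}$, we have $\sqrt{t_1} = X_1$ and $\sqrt{t_2} = X_2 + \cdots + X_n$, so $\bar X = (\sqrt{t_1} + \sqrt{t_2})/n$ is recoverable from $\boldsymbol T$ alone. The function names below are illustrative.

```python
import math
import random

def T(sample):
    """The candidate statistic T = (X_1^2, (X_2 + ... + X_n)^2)."""
    return (sample[0] ** 2, sum(sample[1:]) ** 2)

def mean_from_T(t, n):
    """Recover the sample mean from T alone: for 0/1 data,
    sqrt(t1) = X_1 and sqrt(t2) = X_2 + ... + X_n."""
    t1, t2 = t
    return (math.isqrt(t1) + math.isqrt(t2)) / n

random.seed(0)
n = 10
sample = [random.randint(0, 1) for _ in range(n)]
assert mean_from_T(T(sample), n) == sum(sample) / n
```

Because the sample is binary, squaring loses no information here; with general real-valued data this recovery would fail (signs are lost).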

This then relates to the concept of a minimal sufficient statistic: if both $\bar X$ and $\boldsymbol T$ are sufficient for $p$, then $\bar X$ clearly achieves a greater degree of data reduction than $\boldsymbol T$ (a single scalar versus an ordered pair), which in turn, for $n > 2$, achieves a greater degree of data reduction than the sample $\boldsymbol X$ itself, which is not reduced at all. A statistic that achieves the maximum possible degree of data reduction while remaining sufficient is a minimal sufficient statistic.
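Minimal sufficiency of $\bar X$ (equivalently, of $\sum_i X_i$) can be checked via the standard likelihood-ratio characterization: $f(\boldsymbol x; p)/f(\boldsymbol y; p)$ is free of $p$ exactly when $\sum_i x_i = \sum_i y_i$. A brute-force sketch of that equivalence for small $n$ (Python is my choice of language; the helper names are illustrative):

```python
from itertools import product

def likelihood(x, p):
    """Bernoulli(p) likelihood of a 0/1 sample x."""
    s = sum(x)
    return p ** s * (1 - p) ** (len(x) - s)

def ratio_free_of_p(x, y, ps=(0.2, 0.5, 0.8)):
    """Check numerically whether f(x; p) / f(y; p) is constant in p."""
    ratios = [likelihood(x, p) / likelihood(y, p) for p in ps]
    return max(ratios) - min(ratios) < 1e-12

n = 4
for x in product([0, 1], repeat=n):
    for y in product([0, 1], repeat=n):
        # ratio is p-free exactly when the sample sums agree
        assert ratio_free_of_p(x, y) == (sum(x) == sum(y))
```

Checking a handful of $p$ values is of course only evidence, not a proof; the algebra (the ratio equals $(p/(1-p))^{\sum x_i - \sum y_i}$) is what settles it.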

Answer 2:

If you want the mathematical proof:

Let $f(x_1,\cdots, x_n)$ be the joint probability mass function of the sample observations. To show that $T$ is sufficient for the parameter $p$, it is enough to show that the conditional distribution $f(x_1,\cdots , x_n \mid T)$ does not depend on the parameter. Now

i) $T(X_1, \cdots , X_n) = (X_1,\cdots , X_n)$. Then \begin{equation} P\big((X_1,\cdots, X_n)=(x_1,\cdots,x_n)\mid T=t \big) = P\big((X_1,\cdots, X_n)=(x_1,\cdots,x_n)\mid (X_1,\cdots, X_n) = (t_1,\cdots,t_n)\big), \end{equation}

which is $1$ if $(x_1,\cdots, x_n)= (t_1,\cdots,t_n)$ and $0$ otherwise. Clearly this does not depend on $p$.

ii) $T(X_1,\cdots, X_n) = (X_1^2,[X_2+ \cdots + X_n]^2)$. Since each $X_i \in \{0,1\}$,

$T(X_1,\cdots, X_n) =(t_1,t_2) \iff X_1 = \sqrt {t_1},\ \sum_{i=2}^n X_i = \sqrt {t_2}.$

Let $(x_1,\cdots,x_n)$ be consistent with $t$, i.e. $x_1 = \sqrt{t_1}$ and $\sum_{i=2}^n x_i = \sqrt{t_2}$ (otherwise the conditional probability is $0$). Using the independence of $X_1$ and $(X_2,\cdots,X_n)$, \begin{align} &P\big((X_1,\cdots, X_n)=(x_1,\cdots,x_n)\mid T=t \big) \\ &=\frac{P\big((X_1,\cdots, X_n)=(x_1,\cdots,x_n)\big)}{P\big(X_1 = \sqrt {t_1}\big)\,P\big(\sum_{i=2}^n X_i = \sqrt {t_2}\big)}\\ &=\frac{P\big(X_1 = \sqrt {t_1}\big)\, p^{\sum_{i=2}^n x_i} (1-p)^{(n-1)-\sum_{i=2}^n x_i}}{P\big(X_1 = \sqrt {t_1}\big)\, {n-1 \choose \sqrt {t_2}} p^{\sqrt {t_2}} (1-p)^{(n-1) - \sqrt {t_2}}}\\ & = \frac{1}{{n-1 \choose \sqrt {t_2}}}, \end{align} since $\sum_{i=2}^n x_i = \sqrt{t_2}$. This does not depend on $p$, so $T$ is sufficient.
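As a sanity check, the conditional distribution of the sample given the question's statistic $T = (X_1^2, (X_2+\cdots+X_n)^2)$ can be brute-forced for small $n$ and several values of $p$. This is a verification sketch in Python (my choice of language, with illustrative helper names), not part of the proof:

```python
from itertools import product

def T(x):
    """The statistic T = (X_1^2, (X_2 + ... + X_n)^2)."""
    return (x[0] ** 2, sum(x[1:]) ** 2)

def prob(x, p):
    """Bernoulli(p) probability of the 0/1 sample x."""
    s = sum(x)
    return p ** s * (1 - p) ** (len(x) - s)

def conditional(x, p):
    """P(X = x | T = T(x)) under i.i.d. Bernoulli(p) sampling."""
    t = T(x)
    p_t = sum(prob(y, p) for y in product([0, 1], repeat=len(x)) if T(y) == t)
    return prob(x, p) / p_t

n = 4
for x in product([0, 1], repeat=n):
    vals = [conditional(x, p) for p in (0.1, 0.5, 0.9)]
    # sufficiency: the conditional law of the sample given T is p-free
    assert max(vals) - min(vals) < 1e-12
```

The enumeration confirms that every conditional probability is a constant (a reciprocal of a binomial coefficient), exactly as the calculation predicts.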