definition of a sufficient statistic


The usual definition of a sufficient statistic is that the conditional distribution of the data, given the statistic, does not depend on the parameter $\theta$. The Fisher-Neyman theorem gives a nice characterization:

The statistic $T$ is sufficient iff $f(x;\theta) = h(x)g(T(x);\theta)$ for two nonnegative functions $h, g$.
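To make the factorization concrete, here is a small numeric sketch (my own illustration, not from the question): for an i.i.d. Bernoulli($\theta$) sample, $f(\vec{x};\theta) = \theta^{s}(1-\theta)^{n-s}$ with $s = T(\vec{x}) = \sum_i x_i$, so the theorem holds with $h(\vec{x}) = 1$ and $g(t;\theta) = \theta^{t}(1-\theta)^{n-t}$.

```python
# Illustrative check of the Fisher-Neyman factorization for an
# i.i.d. Bernoulli(theta) sample (an assumed example, not from the post):
# f(x; theta) = h(x) * g(T(x); theta) with h(x) = 1.

def joint_pdf(x, theta):
    """Joint pmf of an i.i.d. Bernoulli(theta) sample x."""
    p = 1.0
    for xi in x:
        p *= theta if xi == 1 else (1 - theta)
    return p

def g(t, theta, n):
    """g(T(x); theta) from the factorization, with h(x) = 1."""
    return theta**t * (1 - theta)**(n - t)

x = [1, 0, 1, 1, 0]
for theta in (0.2, 0.5, 0.9):
    # the factorization holds exactly, for every theta
    assert abs(joint_pdf(x, theta) - g(sum(x), theta, len(x))) < 1e-12
```

Note that $h$ and $g$ are not unique: any constant can be moved between them, which is why the theorem only asks for *some* nonnegative pair.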

In the book All of Statistics, Larry Wasserman defines (I quote):

"Write $\vec{x} \iff \vec{y}$ if $f(\vec{x};\theta) = cf(\vec{y};\theta)$ for some constant $c$ that might depend on $\vec{x}$ and $\vec{y}$ but not on $\theta$. A statistic $T(\vec{x})$ is sufficient if $T(\vec{x})\iff T(\vec{y})$ implies that $\vec{x}\iff \vec{y}$,"

where $\vec{x} = (x_1, \dots, x_n)$ is a data sample. The condition $T(\vec{x})\iff T(\vec{y})$ looks like the Fisher-Neyman theorem applied to two different data samples. How can one show that Wasserman's definition is equivalent to the one above (or to the Fisher-Neyman theorem)?

Best answer:

As you probably know, for a random sample $\vec{X}=(X_{1}, X_{2}, \ldots, X_{n})$ with joint pdf $f(\vec{x}; \theta)$, a statistic $T = T(\vec{X})$ is sufficient if the conditional pdf for $\vec{X}$ given $T$ is ``$\theta$-free". This is the definition of sufficiency. I think it is easier to

(a) believe or show that the Fisher-Neyman Theorem completely characterizes a sufficient statistic

and then

(b) show that Wasserman's definition completely characterizes a sufficient statistic.

Then you will have equivalence of both characterizations.

Suppose that $T=T(\vec{X})$ is a statistic and let $t=T(\vec{x})$ be a fixed value of the statistic. By definition, $T$ is sufficient if $f_{\vec{X}|T}(\vec{x}|t;\theta)$ which is $$ \frac{f_{\vec{X},T}(\vec{x},t;\theta)}{f_{T}(t;\theta)} $$ does not depend on $\theta$.

Now if the argument $t$ is really $t=T(\vec{x})$, then the event $\{T=t\}$ in that numerator is redundant, so the joint density reduces to the density of $\vec{X}$ alone; whereas if $t \ne T(\vec{x})$, the numerator is zero.

(To sort of see this, consider the oversimplified discrete example for a random sample $X_{1}, X_{2}$ of size $2$ and the statistic $T=X_{1}+X_{2}$. Then $P(X_{1}=2, X_{2}=4, X_{1}+X_{2}=6) = P(X_{1}=2, X_{2}=4)$ and $P(X_{1}=2, X_{2}=4, X_{1}+X_{2}=22)=0$.)
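The same point can be checked numerically. Here is a small sketch with an assumed concrete distribution (two independent fair six-sided dice, my choice rather than the answer's): including a consistent value of $T$ in the event changes nothing, while an inconsistent value forces probability zero.

```python
# Numeric version of the discrete point above, under the assumption that
# X1, X2 are independent uniform on {1, ..., 6} (two fair dice).
from fractions import Fraction
from itertools import product

def prob(event):
    """Exact probability of `event` under two independent uniform{1..6} draws."""
    hits = sum(1 for x1, x2 in product(range(1, 7), repeat=2) if event(x1, x2))
    return Fraction(hits, 36)

p_consistent   = prob(lambda a, b: a == 2 and b == 4 and a + b == 6)   # T value matches
p_plain        = prob(lambda a, b: a == 2 and b == 4)                  # T omitted
p_inconsistent = prob(lambda a, b: a == 2 and b == 4 and a + b == 22)  # impossible T

assert p_consistent == p_plain == Fraction(1, 36)  # redundant event changes nothing
assert p_inconsistent == 0                         # inconsistent T gives zero
```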

If $t \ne T(\vec{x})$ and that numerator is $0$, the ratio definitely doesn't depend on $\theta$ so all is good. In the case that $t = T(\vec{x})$ we can write $$ f_{\vec{X}|T}(\vec{x}|t;\theta) = \frac{f_{\vec{X},T}(\vec{x},t;\theta)}{f_{T}(t;\theta)} = \frac{f_{\vec{X}}(\vec{x};\theta)}{f_{T}(t;\theta)} \stackrel{notation}{=} \frac{f(\vec{x};\theta)}{f_{T}(t;\theta)}. $$

``$\theta$-free" means that this is equal to some function that depends on $\vec{x}$ only, say $$ \frac{f(\vec{x};\theta)}{f_{T}(t;\theta)} = h(\vec{x}). $$ In other words, $$ f(\vec{x};\theta) = h(\vec{x}) \, f_{T}(t;\theta) = h(\vec{x}) \, f_{T}(T(\vec{x});\theta). $$

That was all by definition of sufficiency but you can almost see the Fisher-Neyman Theorem right there. (Certainly if this holds, we can choose the function $g$ in the F-N Theorem to be the pdf for $T$. The other direction of showing the F-N Theorem is a little bit more work.)
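Continuing the Bernoulli sketch from above (an assumed example, not part of the answer): with $T = \sum_i X_i$ we have $T \sim \mathrm{Binomial}(n,\theta)$, and the ratio $f(\vec{x};\theta)/f_{T}(T(\vec{x});\theta) = 1/\binom{n}{t}$ is $\theta$-free, exactly the $h(\vec{x})$ of the derivation.

```python
# Check that f(x; theta) / f_T(T(x); theta) does not depend on theta,
# under the assumed Bernoulli setup: T = sum(x) is Binomial(n, theta).
from math import comb

def joint_pdf(x, theta):
    """Joint pmf of an i.i.d. Bernoulli(theta) sample x."""
    return theta**sum(x) * (1 - theta)**(len(x) - sum(x))

def pdf_T(t, theta, n):
    """Binomial(n, theta) pmf of the statistic T = sum of the sample."""
    return comb(n, t) * theta**t * (1 - theta)**(n - t)

x = (1, 1, 0, 1, 0)  # T(x) = 3, n = 5
ratios = {round(joint_pdf(x, th) / pdf_T(sum(x), th, len(x)), 12)
          for th in (0.1, 0.4, 0.8)}
# one single value for every theta: the theta-free h(x) = 1 / C(5, 3)
assert ratios == {round(1 / comb(5, 3), 12)}
```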

Now for the Wasserman definition...

If $T$ is sufficient, by definition we can write $$ f(\vec{x};\theta) = h(\vec{x}) \, f_{T}(t;\theta) = h(\vec{x}) \, f_{T}(T(\vec{x});\theta). $$ If $T(\vec{x}) \Leftrightarrow T(\vec{y})$ then $f_{T}(T(\vec{x});\theta) = c \, f_{T}(T(\vec{y});\theta)$ for some constant $c$ that may depend on $\vec{x}$ and/or $\vec{y}$ but not on $\theta$. So, we have $$ f(\vec{x};\theta) = h(\vec{x}) \, f_{T}(T(\vec{x});\theta) = h(\vec{x}) \cdot c\, f_{T}(T(\vec{y});\theta), $$ but then $f(\vec{y};\theta) = h(\vec{y}) \, f_{T}(T(\vec{y});\theta)$ so we have that $$ f(\vec{x};\theta) = \underbrace{c \frac{h(\vec{x})}{h(\vec{y})}}_{\mbox{new } c} f(\vec{y};\theta) $$ and thus $\vec{x} \Leftrightarrow \vec{y}$.
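This direction can also be sanity-checked numerically, again under the assumed Bernoulli example (my illustration, not the answer's): pick two samples with the same value of $T$, so certainly $T(\vec{x}) \Leftrightarrow T(\vec{y})$, and verify that $f(\vec{x};\theta)/f(\vec{y};\theta)$ is one constant for every $\theta$, i.e. $\vec{x} \Leftrightarrow \vec{y}$.

```python
# Assumed setup: i.i.d. Bernoulli(theta) sample with T(x) = sum(x).
# If T(x) = T(y), the ratio f(x;theta)/f(y;theta) is constant in theta.

def joint_pdf(x, theta):
    """Joint pmf of an i.i.d. Bernoulli(theta) sample x."""
    return theta**sum(x) * (1 - theta)**(len(x) - sum(x))

x = (1, 1, 0, 0, 1)   # T(x) = 3
y = (0, 1, 1, 1, 0)   # T(y) = 3: same value of the statistic
assert sum(x) == sum(y)

ratios = {joint_pdf(x, th) / joint_pdf(y, th) for th in (0.05, 0.5, 0.95)}
assert ratios == {1.0}  # the constant c = 1 here, independent of theta
```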

It remains to show that, if $T(\vec{x}) \Leftrightarrow T(\vec{y})$ implies $\vec{x} \Leftrightarrow \vec{y}$, then $f(\vec{x};\theta)/f_{T}(t;\theta)$ is $\theta$-free... but I'm about to miss my last bus home for the evening!