When does a sufficient statistic not exist by the Factorization Theorem?

The Neyman Factorization Theorem states the following: Let $f(x_1, ..., x_n; \theta)$ denote the joint pmf or pdf of $X_1, ..., X_n$. Then $T = t(X_1, ..., X_n)$ is a sufficient statistic for $\theta$ if and only if the joint pdf or pmf $f$ can be represented as a product of two factors, in which the first factor depends on $\theta$ and on the data only through $t(x_1, ..., x_n)$, and the second factor depends on $x_1, ..., x_n$ but does not depend on $\theta$:

$$f(x_1, ..., x_n; \theta) = g(t(x_1, ..., x_n); \theta) \cdot h(x_1, ..., x_n)$$

Consistent with how function notation usually works, I've seen examples where $h$ is constant. But in that case, wouldn't $g = f$ and $h = 1$ always be a valid factorization? And wouldn't that mean that we always have a sufficient statistic?

If not, what is an example where this doesn't work?

There is 1 best solution below

A sufficient statistic always exists in the parametric setup $X_i\sim F_{\theta}$ where the $X_i$'s are i.i.d ($F$ being the common distribution function and $\theta\in \Omega\subseteq \mathbb R^d$ for some $d$), because the sample $(X_1,\ldots,X_n)$ itself is trivially sufficient for the unknown quantity $\theta$. When $F$ is continuous, the vector of order statistics $(X_{(1)},\ldots,X_{(n)})$ is also sufficient for $\theta$.
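
To make these two factorizations concrete, here is a sketch under the assumption that each $X_i$ has a common pdf or pmf, written $p_\theta$ here (a symbol introduced only for this illustration), with $x_{(1)}\le\cdots\le x_{(n)}$ denoting the sorted sample. Taking $t$ to be the identity map gives the trivial factorization

$$f_{\theta}(x_1,\ldots,x_n)=\underbrace{\prod_{i=1}^n p_\theta(x_i)}_{g(t(x_1,\ldots,x_n);\,\theta)}\cdot\underbrace{1}_{h(x_1,\ldots,x_n)}\,,\qquad t(x_1,\ldots,x_n)=(x_1,\ldots,x_n)\,,$$

and, since a product does not depend on the order of its factors, the same joint density can be written through the order statistics:

$$f_{\theta}(x_1,\ldots,x_n)=\underbrace{\prod_{i=1}^n p_\theta(x_{(i)})}_{g(t(x_1,\ldots,x_n);\,\theta)}\cdot\underbrace{1}_{h(x_1,\ldots,x_n)}\,,\qquad t(x_1,\ldots,x_n)=(x_{(1)},\ldots,x_{(n)})\,.$$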

Consider two common examples:

  • $X_1,\ldots,X_n$ are i.i.d Cauchy with unknown location $\theta$, i.e. joint pdf is

$$f_{\theta}(x_1,\ldots,x_n)=\prod_{i=1}^n\frac1{\pi(1+(x_i-\theta)^2)}=\frac1{\pi^n\prod_{i=1}^n(1+(x_i-\theta)^2)}\,,$$

where $(x_1,\ldots,x_n)\in\mathbb R^n$ and $\theta\in\mathbb R$.

  • $X_1,\ldots,X_n$ are i.i.d having a Laplace distribution with unknown location $\theta$, i.e. with joint pdf

$$f_{\theta}(x_1,\ldots,x_n)=\prod_{i=1}^n \frac12e^{-|x_i-\theta|}=\frac1{2^n}\exp\left[-\sum_{i=1}^n|x_i-\theta|\right]\,,$$

with $(x_1,\ldots,x_n)\in\mathbb R^n$ and $\theta\in\mathbb R$.

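In both cases there is no non-trivial sufficient statistic for $\theta$: the order statistics are minimal sufficient, so the data cannot be reduced beyond sorting without losing information about $\theta$.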
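
A consequence of the factorization theorem gives a quick numerical sanity check for these two examples: if $T$ is sufficient and two samples $x$, $y$ satisfy $t(x)=t(y)$, then $f_\theta(x)/f_\theta(y)=h(x)/h(y)$ cannot depend on $\theta$. The sketch below is only an illustration, not a proof. The function names, the sample values and the grid of $\theta$ values are all assumptions chosen for the example. It compares a sample with a permutation of itself, and with a different sample that shares its mean and median.

```python
import math

def cauchy_loglik(xs, theta):
    """Log of the joint Cauchy(location=theta, scale=1) density at xs."""
    return -sum(math.log(math.pi * (1.0 + (x - theta) ** 2)) for x in xs)

def laplace_loglik(xs, theta):
    """Log of the joint Laplace(location=theta, scale=1) density at xs."""
    return -len(xs) * math.log(2.0) - sum(abs(x - theta) for x in xs)

def ratio_spread(loglik, xs, ys, thetas):
    """Range over thetas of log f_theta(xs) - log f_theta(ys).

    A value near 0 means the likelihood ratio looks free of theta on this grid.
    """
    diffs = [loglik(xs, t) - loglik(ys, t) for t in thetas]
    return max(diffs) - min(diffs)

# Illustrative values only (assumptions, not taken from the question).
thetas = [-2.0, -1.0, 0.0, 0.5, 1.0, 3.0]
x = [0.3, -1.2, 2.5, 0.9]
x_perm = [2.5, 0.9, 0.3, -1.2]   # same order statistics as x
y = [-1.0, 0.3, 0.9, 2.3]        # same mean (0.625) and median (0.6) as x

for name, loglik in [("Cauchy", cauchy_loglik), ("Laplace", laplace_loglik)]:
    same_order = ratio_spread(loglik, x, x_perm, thetas)
    same_mean_median = ratio_spread(loglik, x, y, thetas)
    print(name, "same order statistics:", same_order)
    print(name, "same mean and median :", same_mean_median)
```

The spread should be numerically zero for the permuted sample and clearly non-zero for the sample that only matches in mean and median. This is consistent with neither the sample mean nor the sample median being sufficient in these two models, while the order statistics are.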

So you can take $g=f$ and $h=1$ in the Factorization theorem, but this amounts to taking $t$ to be the whole sample, $t(x_1,\ldots,x_n)=(x_1,\ldots,x_n)$. It does not guarantee a non-trivial sufficient statistic: it might not be possible to find a simpler $t(x_1,\ldots,x_n)$ such that the joint pdf/pmf depends on the data only through $t$, apart from a factor that is free of $\theta$.