Bayesian Statistics: Finding Sufficient Statistic for Uniform Distribution


The example: let $y_1,\dots,y_n \overset{\text{i.i.d.}}\sim U([0,\theta])$, where $\theta >0$ is unknown. Find a sufficient statistic for $\theta$.

Solution attempt:

$$g(y_1,\dots,y_n) = c\quad \text{(constant)}$$

$$P(y_i\mid\theta) = \frac{1}{\theta}\quad \text{ for } 0<y_i<\theta$$

$$P(y_1,\dots,y_n\mid\theta) = \prod_{i=1}^n P(y_i\mid\theta) = \frac{1}{\theta^n}\quad\text{ for } 0<y_1,\dots,y_n<\theta$$

Now this is where I got stuck. I have seen this post about sufficient statistics, but I still cannot finish the argument. Could somebody help me find a sufficient statistic for this problem? (I think the average or the maximum of the $y_i$ might work, but I am not sure how to take the next step.)


There are 2 best solutions below

On BEST ANSWER

Convince yourself that the joint density that you've written can be expressed this way:

$$P(y_1,\ldots,y_n\mid\theta)=\begin{cases} \frac1{\theta^n}&\text{if $\max(y_1,\ldots,y_n)\le\theta$}\\ 0&\text{otherwise}\end{cases}\tag{*}$$

This means that the joint density depends on $y_1,\ldots,y_n$ only through $T(y):=\max(y_1,\ldots,y_n)$. By the factorization criterion, $T(Y_1,\ldots,Y_n) = \max(Y_1,\ldots,Y_n) =: Y_{(n)}$ is a sufficient statistic for $\theta$.
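Spelled out, one possible choice of factors looks like the display below. The labels $g$ and $k$ are only illustrative (here $g$ is read as the data-only factor, which seems to be the role of the constant $g$ in the question's attempt), and the observations are assumed to be nonnegative, as they are with probability one under the model:

$$P(y_1,\ldots,y_n\mid\theta) = g(y_1,\ldots,y_n)\,k\bigl(T(y),\theta\bigr),\qquad g(y_1,\ldots,y_n) = 1,\qquad k(t,\theta)=\begin{cases}\frac1{\theta^n}&\text{if } t\le\theta\\ 0&\text{otherwise.}\end{cases}$$

The factor $g$ does not involve $\theta$, and $k$ sees the data only through $t=T(y)$, which is the form the factorization criterion asks for.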

To make (*) look more like a function, follow @BruceET's suggestion and use indicator functions. Since the observations are nonnegative, every $y_i$ lies in $[0,\theta]$ if and only if their maximum does, so we have:

$$ P(y_1,\ldots,y_n\mid \theta)=\frac1{\theta^n}\prod_{i=1}^nI_{[0,\theta]}(y_i) =\frac1{\theta^n}I_{[0,\theta]}(\max_iy_i) =\frac1{\theta^n}I_{[0,\theta]}(T(y)). $$
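For a quick numerical sanity check of this identity, here is a minimal Python sketch. The function names `likelihood` and `likelihood_via_max`, the seed, and the grid of $\theta$ values are all made up for illustration. It builds two different samples that share the same size and the same maximum, and checks that their likelihoods agree at every $\theta$ on the grid:

```python
import random

def likelihood(ys, theta):
    """Joint density of i.i.d. U([0, theta]) observations, evaluated at ys."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    # Product of indicators: every observation must lie in [0, theta].
    inside = all(0 <= y <= theta for y in ys)
    return theta ** (-len(ys)) if inside else 0.0

def likelihood_via_max(t, n, theta):
    """The same density written through t = max(ys) only.

    Assumes all observations are nonnegative, so only the upper
    bound t <= theta needs to be checked.
    """
    return theta ** (-n) if t <= theta else 0.0

random.seed(0)  # arbitrary seed, only for reproducibility
sample_a = [random.uniform(0, 3) for _ in range(5)]
t = max(sample_a)

# A different sample with the same size and the same maximum.
sample_b = [random.uniform(0, t) for _ in range(4)] + [t]

# Compare the three expressions on a grid of candidate theta values.
thetas = [0.5 * k for k in range(1, 13)]
for theta in thetas:
    la = likelihood(sample_a, theta)
    lb = likelihood(sample_b, theta)
    lt = likelihood_via_max(t, len(sample_a), theta)
    assert la == lb == lt
```

Agreement on a finite grid is not a proof, of course; it only illustrates that once $n$ and $\max_i y_i$ are fixed, the remaining observations do not change the likelihood.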

Aside: There is nothing Bayesian about this calculation.

On

I find it at best irritating to use the same symbol to refer both to the random variable and to the argument to the density function. We can understand such things as $\Pr(Y\le y) = (\text{a certain function of } y)$ because capital $Y$ and lower-case $y$ mean two different things.

Write the density like this and see if you can do something with that:

$$ f_{Y_1,\ldots,Y_n}(y_1,\dots,y_n\mid\theta) = \prod_{i=1}^n f_{Y_i}(y_i\mid\theta) = \begin{cases} 1/\theta^n & \text{if } \max\{y_1,\ldots,y_n\}\le\theta, \\ 0 & \text{if } \max\{y_1,\ldots,y_n\} >\theta. \end{cases} $$