I'm about to take a number of statistics and probability courses next year, so I've been refreshing my knowledge. While working through DeGroot and Schervish, *Probability and Statistics*, Chapter 7, Example 7.1.1, Lifetimes of Electronic Components (below; apologies that the excerpts aren't cleaner, as they come from two separate pages).
I follow the example up to:
Suppose that, before observing the data, the company believes that the failure rate is probably around 0.5/year, but there is quite a bit of uncertainty about it. They model $\theta$ as a random variable having the gamma distribution with parameters 1 and 2. To rephrase....
From this section onwards I'm slightly confused by their notation. I understand that the company believes $\theta$ is around $0.5$/year, but they later talk about the random variable being conditioned on its own parameter. That is, it seems like they're saying that $(X_i \mid \theta) \sim \operatorname{Exp}(\theta)$, but a more literal reading suggests they're saying $X_i \sim \operatorname{Exp}(\theta \mid \theta)$. Is this a reasonable interpretation of DeGroot and Schervish?
My initial thought was that they were considering an estimator, so the assumption being made is that these random variables are iid exponential with parameter $\theta$, given the estimator. Since we can find the estimator from the mean of the random sample, this would then make sense with the later part: yes, we can never know what $\theta$ is, but we can condition on the previously observed mean...
I've been at this for a while now, and in all honesty $X_i \sim \operatorname{Exp}(\theta \mid \theta)$ seems somewhat nonsensical to me, so clarification would be greatly appreciated.


The model is hierarchical: $$\Theta \sim \operatorname{Gamma}(a = 1, b = 2), \\ f_\Theta(\theta) = \frac{b^a \theta^{a-1} e^{-b\theta}}{\Gamma(a)} = 2e^{-2\theta}, \quad \theta > 0,$$ and $$X \mid \Theta \sim \operatorname{Exponential}(\Theta), \\ f_{X\mid\Theta}(x \mid \theta) = \theta e^{-\theta x}, \quad x > 0.$$

Only the observed lifetime random variable $X$ is conditioned on $\theta$. In turn, $\theta$ is, under a Bayesian framework, a random variable whose prior distribution is Gamma with shape $a = 1$ and rate $b = 2$ hyperparameters (hence the prior mean for $\theta$ is $a/b = 1/2$). For a single random lifetime $X$, the conditional distribution given $\theta$ is exponential with rate $\theta$. But there is also an unconditional (marginal) distribution for the lifetime $X$, obtained by weighting the conditional density of $X$ by the Gamma density of $\Theta$.

Just as we have the law of total probability in the fully discrete case, $$\Pr[X = x] = \sum_{\theta \in \Omega} \Pr[X = x \mid \Theta = \theta] \Pr[\Theta = \theta],$$ where $\Omega$ is the support of a discrete-valued random variable $\Theta$, we have in the continuous case $$f_X(x) = \int_{\theta \in \Omega} f_{X \mid \Theta}(x \mid \theta) f_\Theta(\theta) \, d\theta.$$ Specifically, $$f_X(x) = \int_{\theta = 0}^\infty \theta e^{-\theta x} \cdot 2e^{-2\theta} \, d\theta = \frac{2}{(2+x)^2}, \quad x > 0.$$ This happens to be a Pareto distribution.
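As a quick numerical sanity check (a minimal sketch using `scipy`; the function name `marginal_density` is mine, not from the book), one can verify that the mixing integral above really does match the closed form $2/(2+x)^2$:

```python
import numpy as np
from scipy import integrate

def marginal_density(x):
    # Mix the Exponential(theta) density over the Gamma(shape=1, rate=2)
    # prior: integrate theta * e^{-theta x} * 2 e^{-2 theta} over (0, inf).
    integrand = lambda theta: theta * np.exp(-theta * x) * 2.0 * np.exp(-2.0 * theta)
    value, _ = integrate.quad(integrand, 0, np.inf)
    return value

for x in [0.5, 1.0, 3.0]:
    closed_form = 2.0 / (2.0 + x) ** 2   # the Pareto-type density derived above
    assert abs(marginal_density(x) - closed_form) < 1e-8
```

The numerical integral agrees with the closed form at each test point, which is a useful habit whenever you compute a marginal by hand.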
In the more general case where there are multiple lifetimes (one for each component), $X_1, X_2, \ldots, X_n$ comprise a sample from which we may estimate $\theta$. In particular, it is natural to choose $$\hat \theta = \frac{n}{\sum_{i=1}^n X_i},$$ but it is not obvious what the distribution of this estimator looks like. I leave this as an exercise.
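As a hint toward that exercise, here is a simulation sketch (variable names are mine). For a fixed true $\theta$, $\sum_i X_i \sim \operatorname{Gamma}(n, \theta)$, so $\hat\theta = n/\sum_i X_i$ has an inverse-gamma-type distribution with exact mean $n\theta/(n-1)$ for $n > 1$; a Monte Carlo run should reproduce that mean:

```python
import numpy as np

# Simulate the sampling distribution of theta_hat = n / sum(X_i)
# for iid Exponential(theta) lifetimes with a fixed true rate theta.
rng = np.random.default_rng(0)
theta_true, n, reps = 0.5, 10, 200_000

# numpy parameterizes the exponential by its scale = 1/rate
samples = rng.exponential(scale=1.0 / theta_true, size=(reps, n))
theta_hat = n / samples.sum(axis=1)

# Exact mean of theta_hat: since 1/Gamma(n, rate=theta) has mean theta/(n-1),
# E[theta_hat] = n * theta / (n - 1), which slightly overestimates theta.
expected_mean = n * theta_true / (n - 1)
assert abs(theta_hat.mean() - expected_mean) < 0.01
```

Note that $\hat\theta$ is biased upward by the factor $n/(n-1)$, which the simulation makes easy to see before you prove it analytically.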