Bayesian updating for normal prior when data is only observed on a subset of the support


Consider the random variable $A$ whose prior is $A\sim \mathcal{N}( \mu,\sigma_{\mu})$, where both parameters are known. Suppose that I draw a realization of the random variable $X\sim \mathcal{N}(A,\sigma_x)$, but I observe its realization $x$ if and only if $x \leq k$. The threshold $k$ is also known.

How can I do the Bayesian updating in the case where the realized $x$ is larger than $k$ and hence not observed? Here is my attempt, though I'm not sure it makes sense:

Consider first the standard case, in which I do observe $x$. The posterior density is:

$$ \begin{align*} \pi (A \mid x ) \propto \pi(x\mid A) \pi (A) \end{align*} $$ where $\pi (A)$ is the density of the parameter $A$. Now, if $x$ is realized but not observed ($x>k$), all I learn is the event $\{X>k\}$, so the posterior mean $\hat{A}\mid\{\text{not observing } x\}$ would be:

$$ \begin{align*} \hat{A}\mid\{\text{not observing } x\} = \dfrac{\int_{-\infty}^{+\infty}A \,\color{blue}{P(X>k\mid A)}\,\pi (A) \, \mathrm{d} A}{\int_{-\infty}^{+\infty}\color{blue}{P(X>k\mid A)}\,\pi (A) \, \mathrm{d} A} \end{align*} $$

Does that make sense?
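As a sanity check on this proposed update, here is a minimal Python sketch (all parameter values are made up for illustration; note that the survival probability inside the integrals must be conditioned on $A$, i.e. $P(X>k\mid A)$) that evaluates the posterior mean given only the event $\{X>k\}$ by grid integration:

```python
import math

# Hypothetical parameter values (not from the question).
mu, sigma_mu = 0.0, 1.0   # prior: A ~ N(mu, sigma_mu^2)
sigma_x = 1.0             # likelihood: X | A ~ N(A, sigma_x^2)
k = 0.5                   # observation threshold

def norm_pdf(z, m, s):
    return math.exp(-0.5 * ((z - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def norm_sf(z, m, s):
    # P(Z > z) for Z ~ N(m, s^2), via the complementary error function
    return 0.5 * math.erfc((z - m) / (s * math.sqrt(2)))

# Grid approximation of
#   E[A | X > k] = ∫ A P(X>k|A) π(A) dA / ∫ P(X>k|A) π(A) dA
grid = [mu - 8 * sigma_mu + i * (16 * sigma_mu / 4000) for i in range(4001)]
dA = grid[1] - grid[0]
weights = [norm_sf(k, a, sigma_x) * norm_pdf(a, mu, sigma_mu) for a in grid]
num = sum(a * w for a, w in zip(grid, weights)) * dA
den = sum(weights) * dA
post_mean = num / den
print(post_mean)  # pulled above the prior mean: {X > k} favors larger A
```

The resulting posterior mean lies above the prior mean $\mu$, which matches the intuition that failing to observe $x$ is evidence that $A$ is large.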

There is 1 solution below.

I'm not sure if I'm missing something, but since you say you only observe the data if $x \leq k$, you would never have a situation where you have to update with an observed data point that is greater than $k$, right?

Here is how to update when you do see an observation $y \leq k$; the update must account for the fact that data are observed iff they are less than or equal to $k$. Note that your observed data $y$ is actually generated from the conditional distribution of $X$ given that $X\leq k$. Therefore, if $Y$ is the random variable of your observed data, we have the update

$$ \begin{align*} \pi (A \mid y) \propto \color{blue}{\pi(y\mid A)} \pi (A), \end{align*} $$ where $\pi (A)$ is the density of the parameter $A$ (which you know since you know the distribution of $A$), and $\pi(y\mid A)$ is the density of our observed data given $A$. Since we said that we observe $X$ iff it is less than or equal to $k$, this last density is just $$\pi(y\mid A) = \pi(y\mid A,\, y\leq k).$$

To get this from the density of $X\mid A$, recall that if $Z$ is a random variable with density $f_{Z}$, then the conditional density of $Z$ given $Z \leq k$ is $$f_{Z\mid Z\leq k}(z) = \dfrac{f_{Z}(z)I(z\leq k)}{P(Z\leq k)},$$

where $I(\cdot)$ represents an indicator function and $P(Z\leq k)$ is the probability that $Z\leq k$. Therefore, we have $$\color{blue}{\pi(y\mid A)} = \frac{\pi_{X\mid A}(y)I(y\leq k)}{P(X\leq k\mid A)},$$ where $\pi_{X\mid A}(\cdot)$ is the density of a $\mathcal{N}(A, \sigma_{x})$ random variable.
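For the normal model here, the denominator has a closed form in terms of the standard normal CDF $\Phi$ and density $\varphi$, so the truncated likelihood can be written explicitly:

$$\color{blue}{\pi(y\mid A)} = \frac{\frac{1}{\sigma_x}\,\varphi\!\left(\frac{y-A}{\sigma_x}\right)}{\Phi\!\left(\frac{k-A}{\sigma_x}\right)}\, I(y\leq k).$$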

Note that since you should only observe data $y$ if $y\leq k$, the indicator will be equal to $1$.
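As a numerical illustration, here is a minimal Python sketch (all parameter values are made up) that compares this truncated-likelihood update against a naive update that ignores the truncation, using grid integration:

```python
import math

# Hypothetical values for the known quantities (not from the answer).
mu, sigma_mu = 0.0, 1.0   # prior: A ~ N(mu, sigma_mu^2)
sigma_x = 1.0             # X | A ~ N(A, sigma_x^2)
k = 0.5                   # observe X iff X <= k
y = -0.2                  # one observed value, y <= k

def norm_pdf(z, m, s):
    return math.exp(-0.5 * ((z - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def norm_cdf(z, m, s):
    # P(Z <= z) for Z ~ N(m, s^2), via the complementary error function
    return 0.5 * math.erfc(-(z - m) / (s * math.sqrt(2)))

grid = [mu - 8 + i * (16 / 4000) for i in range(4001)]

# Truncated likelihood: π(y|A) = π_{X|A}(y) / P(X <= k | A), for y <= k
trunc = [norm_pdf(y, a, sigma_x) / norm_cdf(k, a, sigma_x)
         * norm_pdf(a, mu, sigma_mu) for a in grid]
# Naive likelihood that ignores the truncation mechanism
naive = [norm_pdf(y, a, sigma_x) * norm_pdf(a, mu, sigma_mu) for a in grid]

mean_trunc = sum(a * w for a, w in zip(grid, trunc)) / sum(trunc)
mean_naive = sum(a * w for a, w in zip(grid, naive)) / sum(naive)
print(mean_naive, mean_trunc)
```

Dividing by $P(X\leq k\mid A)$, which decreases in $A$, upweights larger values of $A$, so the truncated-likelihood posterior mean sits above the naive one: seeing $y\leq k$ is less informative against large $A$ once you account for the fact that you would not have seen $x$ at all had it exceeded $k$.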