Showing $ \int_\mathbb{R} \mathbb{E}_\theta[X\mid Y]N(\mu,\sigma^2)(d\theta) = \frac{\mu+\sigma^2X}{1+\sigma^2}. $ [In relation to Bandit Problems]


Suppose that $X,Y \sim N(\theta,1)$, where $\theta$ is drawn from the prior distribution $N(\mu,\sigma^2)$. I want to verify that $$ \int_\mathbb{R} \mathbb{E}_\theta[X\mid Y]N(\mu,\sigma^2)(d\theta) = \frac{\mu+\sigma^2X}{1+\sigma^2}. $$ Can anyone help me? In particular, I am stuck on computing $\mathbb{E}_\theta[X\mid Y]$. I have tried writing this out as $$ \mathbb{E}_\theta[X\mid Y=y]=\int_\mathbb{R}xf_{X\mid Y}(x \mid y)dx $$ but this doesn't seem to lead anywhere. I need to use the fact that I know the marginal distributions of both $X$ and $Y$, and that $\theta$ is fixed inside the outer integral.


To provide a bit of context, this is reformulated from an example of strategies in bandit problems in a Reinforcement Learning textbook ($X$ and $Y$ being rewards in different rounds). As I am fairly new to the Bayesian point of view, this might be a simple problem, but I still struggle quite a bit.
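
A quick numerical sanity check may help pin down what the identity should look like. The sketch below assumes that $X$ and $Y$ are conditionally independent given $\theta$, which is not stated above but is the usual setup for rewards in different rounds. Under that assumption $(X,Y)$ is jointly Gaussian after mixing over the prior, so $\mathbb{E}[X\mid Y]$ is linear in $Y$, and the standard normal-normal conjugate formula gives the posterior mean $(\mu+\sigma^2 Y)/(1+\sigma^2)$, with $Y$ rather than $X$ on the right-hand side. The function names and the parameter values `mu = 1.5`, `sigma2 = 2.0` are illustrative choices only.

```python
import numpy as np


def posterior_mean(y, mu, sigma2):
    """Conjugate normal-normal posterior mean of theta after one observation y
    with unit observation variance: (mu + sigma2 * y) / (1 + sigma2)."""
    return (mu + sigma2 * y) / (1.0 + sigma2)


def simulate_regression(mu, sigma2, n=400_000, seed=0):
    """Sample theta from the prior, then X and Y independently from N(theta, 1)
    (conditional independence is an assumption), and fit a least-squares line
    of X on Y. For jointly Gaussian (X, Y) that line estimates E[X | Y]."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(mu, np.sqrt(sigma2), size=n)
    x = theta + rng.standard_normal(n)
    y = theta + rng.standard_normal(n)
    slope, intercept = np.polyfit(y, x, 1)
    return slope, intercept


mu, sigma2 = 1.5, 2.0  # illustrative values, not from the question
slope, intercept = simulate_regression(mu, sigma2)

# Compare the fitted line with the coefficients of (mu + sigma2 * y) / (1 + sigma2).
print("slope    :", slope, "vs", sigma2 / (1.0 + sigma2))
print("intercept:", intercept, "vs", mu / (1.0 + sigma2))
```

If the fitted slope and intercept agree with $\sigma^2/(1+\sigma^2)$ and $\mu/(1+\sigma^2)$ up to Monte Carlo error, that supports reading the target as $\mathbb{E}[X\mid Y]=\mathbb{E}[\theta\mid Y]$. Note that for fixed $\theta$ the same independence assumption gives $\mathbb{E}_\theta[X\mid Y]=\theta$, so the integral would have to be taken against the posterior of $\theta$ given $Y$ rather than the prior to produce a formula of this shape.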