I have a normal distribution with unknown mean $\mu$. My prior on $\mu$ has mean $\mu_0$ and variance $\sigma_0^2$.
Say I have $n$ observations from this distribution:
$$A_1, A_2, \ldots, A_n.$$
What is
$$E(\mu \mid A_1, A_2, \ldots, A_n)?$$
My attempt:
Isn't this just the average of all the observations? (But that looks like a frequentist approach.) If we have a prior, does that mean we are adopting a Bayesian approach?
We assume the data are normally distributed, i.e., $p(x \mid \mu) = {\cal N}(x \mid \mu, \sigma^2)$, where the variance $\sigma^2$ is known and we seek to estimate the mean $\mu$. We also assume prior information over the mean, i.e., $p(\mu) = {\cal N}(\mu \mid \mu_0, \sigma_0^2)$, where both $\mu_0$ and $\sigma_0^2$ are known.
One computes the sample mean of the $n$ points, $\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^n A_i$. Then, after a few steps of the derivation (multiplying the likelihood by the prior and completing the square in $\mu$), the posterior mean turns out to be
$$\mu_n = \left( \frac{n \sigma_0^2}{n \sigma_0^2 + \sigma^2} \right) \hat{\mu}_n + \frac{\sigma^2}{n \sigma_0^2 + \sigma^2} \mu_0$$
and the posterior variance
$$\sigma_n^2 = \frac{\sigma_0^2 \sigma^2}{n \sigma_0^2 + \sigma^2}.$$
Notice that $\mu_n$ is a weighted average of the sample mean and the prior mean. If the number of points $n$ is large, the posterior mean is dominated by the sample mean $\hat{\mu}_n$; if $n$ is small, it stays close to the prior mean. In the extremes: as $n \to \infty$ the posterior mean tends to the sample mean $\hat{\mu}_n$, while for $n = 0$ it is simply $\mu_0$, just as you would expect.
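If it helps, here is a small numerical sketch of those two formulas (the particular numbers, prior $\mathcal{N}(0, 1)$ and likelihood variance $4$, are just illustrative choices, not from the derivation above):

```python
import random

def posterior_mean_var(obs, mu0, var0, var):
    """Posterior mean and variance of mu for Normal data with known
    variance `var` and a Normal(mu0, var0) prior on the mean."""
    n = len(obs)
    if n == 0:
        # no data: the posterior is just the prior
        return mu0, var0
    xbar = sum(obs) / n  # sample mean, the hat-mu_n above
    denom = n * var0 + var
    post_mean = (n * var0 * xbar + var * mu0) / denom
    post_var = (var0 * var) / denom
    return post_mean, post_var

random.seed(0)
# 1000 draws from N(2, 2^2); prior N(0, 1), known likelihood variance 4
data = [random.gauss(2.0, 2.0) for _ in range(1000)]
m, v = posterior_mean_var(data, mu0=0.0, var0=1.0, var=4.0)
xbar = sum(data) / len(data)
print(m, xbar, v)
```

With many observations the posterior mean lands very close to the sample mean, and the posterior variance shrinks well below the prior variance, matching the limiting behavior described above.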
(See Section 3.4 of Pattern classification (2nd, ed.) by Duda, Hart, and Stork (Wiley, 2001).)