I'm currently studying the Bayes' estimator from the book Introduction to Machine Learning (Alpaydin, 2014) and have a question about the calculations in one of the derivations. The relevant passage says:
Let us suppose that $x^t \sim \mathcal{N}(\theta, \sigma^2)$ and $\theta \sim \mathcal{N}(\mu_0, \sigma_0^2)$, where $\mu_0$, $\sigma_0^2$, and $\sigma^2$ are known:
$$ \begin{align} P(\mathcal{X} | \theta) & = \dfrac{1}{(2 \pi)^{N / 2} \sigma^N}\exp \left({-\frac{\sum_t (x^t - \theta)^2}{2\sigma^2}}\right) \\ P(\theta) & = \dfrac{1}{(2\pi)^{1/2} \sigma_0} \exp \left( -\frac{(\theta - \mu_0)^2}{2\sigma^2_0} \right) \end{align} $$
It can be shown that $P(\theta | \mathcal{X})$ is normal with
$$\mathrm{E}[\theta | \mathcal{X}] = \frac{N / \sigma^2}{N/\sigma^2 + 1/\sigma_0^2} m + \frac{1/\sigma_0^2}{N/\sigma^2 + 1/\sigma_0^2} \mu_0$$
The two parts I'm having trouble with are: 1) the calculation that leads to the conditional expectation shown above, and 2) why this result implies that $P(\theta | \mathcal{X})$ is normally distributed.
I am assuming this has to do with the conjugate prior, as in this community question, but is it enough to observe that the prior is conjugate and then conclude from the functional form alone that the distribution of $\theta | \mathcal{X}$ is normal?
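For context, here is as far as I got with the completing-the-square argument (this is my own working, so it may contain mistakes). Dropping factors that do not depend on $\theta$:

$$P(\theta | \mathcal{X}) \propto P(\mathcal{X} | \theta) P(\theta) \propto \exp\left( -\frac{\sum_t (x^t - \theta)^2}{2\sigma^2} - \frac{(\theta - \mu_0)^2}{2\sigma_0^2} \right)$$

Expanding and collecting the terms in $\theta$ (with $m = \frac{1}{N}\sum_t x^t$):

$$P(\theta | \mathcal{X}) \propto \exp\left( -\frac{1}{2}\left(\frac{N}{\sigma^2} + \frac{1}{\sigma_0^2}\right)\theta^2 + \left(\frac{Nm}{\sigma^2} + \frac{\mu_0}{\sigma_0^2}\right)\theta \right)$$

which looks like the kernel of a Gaussian in $\theta$ with mean $\left(\frac{Nm}{\sigma^2} + \frac{\mu_0}{\sigma_0^2}\right) / \left(\frac{N}{\sigma^2} + \frac{1}{\sigma_0^2}\right)$, matching the expectation above. Is this the intended argument?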
Regarding the calculation, I attempted to derive the expectation as:
$$ \begin{align} \mathrm{E}[\theta | \mathcal{X}] & = \int\theta P(\theta | \mathcal{X}) d\theta \\ & = \int \theta \dfrac{P(\mathcal{X} | \theta)P(\theta)}{P(\mathcal{X})} d\theta \end{align}$$
and then plugging the appropriate PDFs into the integral. Is this approach correct?
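To convince myself the closed form is at least numerically right, I wrote a quick sanity check (my own script, not from the book): it compares the book's formula for $\mathrm{E}[\theta | \mathcal{X}]$ against a grid approximation of the integral $\int \theta \, P(\mathcal{X}|\theta) P(\theta) \, d\theta / \int P(\mathcal{X}|\theta) P(\theta) \, d\theta$.

```python
import numpy as np

rng = np.random.default_rng(0)
mu0, sigma0, sigma, N = 2.0, 1.5, 1.0, 10
x = rng.normal(3.0, sigma, size=N)  # sample drawn with true theta = 3.0
m = x.mean()

# Closed form from the book: precision-weighted average of m and mu0
w = (N / sigma**2) / (N / sigma**2 + 1 / sigma0**2)
closed_form = w * m + (1 - w) * mu0

# Grid approximation of E[theta | X] via Bayes' rule;
# P(X) cancels when we normalise the unnormalised posterior.
theta = np.linspace(-10.0, 10.0, 200001)
d = theta[1] - theta[0]
log_post = (-((x[:, None] - theta) ** 2).sum(axis=0) / (2 * sigma**2)
            - (theta - mu0) ** 2 / (2 * sigma0**2))
post = np.exp(log_post - log_post.max())  # subtract max for stability
post /= post.sum() * d                    # normalise to integrate to 1
numeric = (theta * post).sum() * d

print(closed_form, numeric)  # the two values should agree closely
```

The two numbers match to several decimal places, which at least confirms that plugging the PDFs into the integral recovers the book's formula.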
Any tips or feedback are appreciated. Thanks in advance.