Lemma: Given the Bayesian linear model $$y = X\beta + w$$ where $\beta \sim N(0, I_d)$ (the prior on the parameter) and $w \sim N(0, I_n)$, the distribution of $\beta$ conditioned on $y$ (the posterior) is Gaussian with covariance $(X^TX + I_d)^{-1}$ and mean $(X^TX + I_d)^{-1}X^Ty$, i.e.: $$\{\beta \mid y\} = N\left((X^TX + I_d)^{-1}X^Ty,\; (X^TX + I_d)^{-1}\right)$$ Here $N(\mu, \Sigma)$ denotes the Gaussian distribution with mean $\mu$ and covariance $\Sigma$.
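(Not part of the book's argument, just a numerical sanity check I find helpful: the lemma's covariance can be compared against the standard Gaussian conditioning formula, using the joint covariances $\mathrm{Cov}(\beta) = I_d$, $\mathrm{Cov}(y) = XX^T + I_n$, $\mathrm{Cov}(\beta, y) = X^T$. The sizes and seed below are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3                      # arbitrary test sizes: samples, predictors
X = rng.standard_normal((n, d))
I_d, I_n = np.eye(d), np.eye(n)

# Standard Gaussian conditioning:
# Cov(beta | y) = Cov(beta) - Cov(beta, y) Cov(y)^{-1} Cov(y, beta)
#              = I_d - X^T (X X^T + I_n)^{-1} X
cov_conditioning = I_d - X.T @ np.linalg.inv(X @ X.T + I_n) @ X

# The lemma's closed form.
cov_lemma = np.linalg.inv(X.T @ X + I_d)

print(np.allclose(cov_conditioning, cov_lemma))  # True
```

The two expressions agree by the matrix inversion (Woodbury) lemma: $(X^TX + I_d)^{-1} = I_d - X^T(XX^T + I_n)^{-1}X$.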
To determine the conditional distribution of $\beta$, my book decomposes $\beta$ as $\beta = Ay + z$, where $A \in \mathbb{R}^{d \times n}$ is a matrix ($n$ is the number of samples, $d$ is the number of predictors and the rank of $X$). This decomposition both shows that $\{\beta \mid y\}$ is Gaussian and eases finding the covariance.
For this to work, $z$ and $y$ need to be independent of each other and individually Gaussian. $y$ is indeed Gaussian, because $\beta$ and $w$ are independent and Gaussian and the linear transformation $X$ preserves Gaussianity. For $z$ to be independent of $y$ it needs to satisfy $E[zy^T] = E[z]E[y]^T$ (uncorrelatedness suffices here because $z$ and $y$ are jointly Gaussian), and to get that we need $A = (X^TX + I_d)^{-1}X^T$.
$z$ is a random variable, independent of $y$, which (for the sake of proving the joint Gaussian property) is found to be distributed as $z \sim N(0, (X^TX + I_d)^{-1})$.
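(Again not from the book, just a check of these two claims: with $z = \beta - Ay$ we have $\mathrm{Cov}(z, y) = X^T - A(XX^T + I_n)$ and $\mathrm{Cov}(z) = I_d - AX - X^TA^T + A(XX^T + I_n)A^T$, and both can be evaluated numerically for the stated choice of $A$; sizes and seed are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 8, 3                      # arbitrary test sizes
X = rng.standard_normal((n, d))
I_d, I_n = np.eye(d), np.eye(n)

A = np.linalg.inv(X.T @ X + I_d) @ X.T

# Cov(z, y) = Cov(beta, y) - A Cov(y) = X^T - A (X X^T + I_n): should vanish.
cross_cov = X.T - A @ (X @ X.T + I_n)

# Cov(z) = I_d - A X - X^T A^T + A (X X^T + I_n) A^T: should equal (X^T X + I_d)^{-1}.
cov_z = I_d - A @ X - X.T @ A.T + A @ (X @ X.T + I_n) @ A.T

print(np.allclose(cross_cov, 0))                         # True
print(np.allclose(cov_z, np.linalg.inv(X.T @ X + I_d)))  # True
```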
Finally, since we have proven all these properties, we can derive the distribution of $\{\beta \mid y\}$. The mean of $\{\beta \mid y\}$ comes from: $$E[\beta \mid y] = E[Ay + z \mid y] = E[Ay \mid y] + E[z \mid y] = Ay + 0 = (X^TX + I_d)^{-1}X^Ty$$ Note that $E[Ay \mid y] = Ay$ is constant, since conditioned on $y$ the value $Ay$ is known ($A$ is a fixed matrix).
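(As a side check, not from the book: this mean agrees with the one from standard Gaussian conditioning, $\mathrm{Cov}(\beta, y)\,\mathrm{Cov}(y)^{-1}y = X^T(XX^T + I_n)^{-1}y$, via the push-through identity $(X^TX + I_d)^{-1}X^T = X^T(XX^T + I_n)^{-1}$. The sizes, seed, and the particular $y$ below are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3                      # arbitrary test sizes
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)       # an arbitrary observed y
I_d, I_n = np.eye(d), np.eye(n)

# Posterior mean from the lemma: (X^T X + I_d)^{-1} X^T y.
mean_lemma = np.linalg.inv(X.T @ X + I_d) @ X.T @ y

# Same mean via Gaussian conditioning: X^T (X X^T + I_n)^{-1} y.
mean_conditioning = X.T @ np.linalg.inv(X @ X.T + I_n) @ y

print(np.allclose(mean_lemma, mean_conditioning))  # True
```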
What I have problems with is how to derive the covariance of $\{\beta \mid y\}$. From: $$E[\beta\beta^T | y] = E[(Ay + z)(Ay + z)^T | y] = E[Ayy^TA^T + Ayz^T + zy^TA^T + zz^T | y] $$ $$ E[\beta\beta^T | y] = E[Ayy^TA^T|y] + E[Ayz^T|y] + E[zy^TA^T|y] + E[zz^T|y]$$
we know that $y$ and $A$ are constant inside these conditional expectations, so they can be moved out of the expectation brackets, and we get:
$$ E[\beta\beta^T | y] = Ayy^TA^T + AyE[z^T|y] + E[z|y]y^TA^T + E[zz^T|y]$$
Since $z$ is independent of $y$ and has zero mean, $E[z^T \mid y] = E[z^T] = 0$, so $AyE[z^T \mid y] = 0$ and $E[z \mid y]y^TA^T = 0$, and we get: $$ E[\beta\beta^T | y] = Ayy^TA^T + E[zz^T|y] = Ayy^TA^T + (X^TX + I_d)^{-1}$$
Those are my own steps for calculating the covariance. My question is how I should continue to finally get $$ E[\beta\beta^T | y] = (X^TX + I_d)^{-1} $$
Yeah, I assumed my covariance wrong: I forgot that I have to subtract the mean in the covariance formula, which was always $0$ for the previous distributions. Since $E[\beta \mid y] = (X^TX + I_d)^{-1}X^Ty = Ay$ and $\beta = Ay + z$:
$$ Cov(\beta | y) = E[(\beta - E[\beta | y])(\beta - E[\beta | y])^T | y] = E[(Ay + z - Ay)(Ay + z - Ay)^T | y] = E[zz^T |y] = E[zz^T] = (X^TX + I_d)^{-1}$$
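(For completeness, a Monte Carlo check of the resolution above, not from the book: sampling $\beta$ and $w$ from the model, the residual $z = \beta - Ay$ should empirically have covariance $\approx (X^TX + I_d)^{-1}$ and be uncorrelated with $y$. Sizes, seed, sample count, and tolerance are all arbitrary choices.)

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, trials = 5, 2, 200_000
X = rng.standard_normal((n, d))
I_d = np.eye(d)
A = np.linalg.inv(X.T @ X + I_d) @ X.T

beta = rng.standard_normal((trials, d))   # beta ~ N(0, I_d), one sample per row
w = rng.standard_normal((trials, n))      # w ~ N(0, I_n)
y = beta @ X.T + w                        # y = X beta + w, row-wise
z = beta - y @ A.T                        # z = beta - A y, row-wise

emp_cov_z = np.cov(z, rowvar=False)       # empirical Cov(z)
emp_cross = z.T @ y / trials              # empirical E[z y^T] (z, y are mean zero)

print(np.allclose(emp_cov_z, np.linalg.inv(X.T @ X + I_d), atol=0.02))
print(np.allclose(emp_cross, 0, atol=0.02))
```

Both checks should print `True` up to Monte Carlo error at this sample size.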