Lemma: Given the Bayesian linear model $$y = X\beta + w$$ where $\beta \sim N(0, I_d)$ (the prior on the parameter) and $w \sim N(0, I_n)$, the distribution of $\beta$ conditioned on $y$ (the posterior) is Gaussian with covariance $(X^TX + I_d)^{-1}$ and mean $(X^TX + I_d)^{-1}X^Ty$, i.e.: $$\{\beta \mid y\} = N\left((X^TX + I_d)^{-1}X^Ty,\; (X^TX + I_d)^{-1}\right)$$ Here $N(\mu, \Sigma)$ denotes the Gaussian distribution with mean $\mu$ and covariance $\Sigma$.
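(Not part of the book's argument, just a numerical sanity check I find helpful: the lemma's covariance can be compared against the standard Gaussian conditioning formula, using the joint covariances $\mathrm{Cov}(\beta) = I_d$, $\mathrm{Cov}(y) = XX^T + I_n$, $\mathrm{Cov}(\beta, y) = X^T$. The sizes and seed below are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3                      # arbitrary test sizes: samples, predictors
X = rng.standard_normal((n, d))
I_d, I_n = np.eye(d), np.eye(n)

# Standard Gaussian conditioning:
# Cov(beta | y) = Cov(beta) - Cov(beta, y) Cov(y)^{-1} Cov(y, beta)
#              = I_d - X^T (X X^T + I_n)^{-1} X
cov_conditioning = I_d - X.T @ np.linalg.inv(X @ X.T + I_n) @ X

# The lemma's closed form.
cov_lemma = np.linalg.inv(X.T @ X + I_d)

print(np.allclose(cov_conditioning, cov_lemma))  # True
```

The two expressions agree by the matrix inversion (Woodbury) lemma: $(X^TX + I_d)^{-1} = I_d - X^T(XX^T + I_n)^{-1}X$.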
To determine the conditional distribution of $\beta$, my book decomposes $\beta$ as $\beta = Ay + z$, where $A \in \mathbb{R}^{d \times n}$ is a matrix ($n$ is the number of samples, $d$ is the number of predictors and the rank of $X$). This decomposition both shows that $\{\beta \mid y\}$ is Gaussian and eases finding the covariance.
For this to work, $z$ and $y$ need to be independent of each other and individually Gaussian. $y$ is indeed Gaussian, because $\beta$ and $w$ are independent and Gaussian and the linear transformation $X$ preserves Gaussianity. For $z$ to be independent of $y$ it needs to satisfy $E[zy^T] = E[z]E[y]^T$ (uncorrelatedness suffices here because $z$ and $y$ are jointly Gaussian), and to get that we need $A = (X^TX + I_d)^{-1}X^T$.
$z$ is a random variable, independent of $y$, which (for the sake of proving the joint Gaussian property) is found to be distributed as $z \sim N(0, (X^TX + I_d)^{-1})$.
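(Again not from the book, just a check of these two claims: with $z = \beta - Ay$ we have $\mathrm{Cov}(z, y) = X^T - A(XX^T + I_n)$ and $\mathrm{Cov}(z) = I_d - AX - X^TA^T + A(XX^T + I_n)A^T$, and both can be evaluated numerically for the stated choice of $A$; sizes and seed are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 8, 3                      # arbitrary test sizes
X = rng.standard_normal((n, d))
I_d, I_n = np.eye(d), np.eye(n)

A = np.linalg.inv(X.T @ X + I_d) @ X.T

# Cov(z, y) = Cov(beta, y) - A Cov(y) = X^T - A (X X^T + I_n): should vanish.
cross_cov = X.T - A @ (X @ X.T + I_n)

# Cov(z) = I_d - A X - X^T A^T + A (X X^T + I_n) A^T: should equal (X^T X + I_d)^{-1}.
cov_z = I_d - A @ X - X.T @ A.T + A @ (X @ X.T + I_n) @ A.T

print(np.allclose(cross_cov, 0))                         # True
print(np.allclose(cov_z, np.linalg.inv(X.T @ X + I_d)))  # True
```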
Finally, since we have proven all these properties, we can derive the distribution of $\{\beta \mid y\}$. The mean of $\{\beta \mid y\}$ comes from: $$E[\beta \mid y] = E[Ay + z \mid y] = E[Ay \mid y] + E[z \mid y] = Ay + 0 = (X^TX + I_d)^{-1}X^Ty$$ Note that $E[Ay \mid y] = Ay$ is constant, since conditioned on $y$ the value $Ay$ is known ($A$ is a fixed matrix).
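(As a side check, not from the book: this mean agrees with the one from standard Gaussian conditioning, $\mathrm{Cov}(\beta, y)\,\mathrm{Cov}(y)^{-1}y = X^T(XX^T + I_n)^{-1}y$, via the push-through identity $(X^TX + I_d)^{-1}X^T = X^T(XX^T + I_n)^{-1}$. The sizes, seed, and the particular $y$ below are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3                      # arbitrary test sizes
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)       # an arbitrary observed y
I_d, I_n = np.eye(d), np.eye(n)

# Posterior mean from the lemma: (X^T X + I_d)^{-1} X^T y.
mean_lemma = np.linalg.inv(X.T @ X + I_d) @ X.T @ y

# Same mean via Gaussian conditioning: X^T (X X^T + I_n)^{-1} y.
mean_conditioning = X.T @ np.linalg.inv(X @ X.T + I_n) @ y

print(np.allclose(mean_lemma, mean_conditioning))  # True
```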
What I have problems with is how to derive the covariance of $\{\beta \mid y\}$. From: $$E[\beta\beta^T | y] = E[(Ay + z)(Ay + z)^T | y] = E[Ayy^TA^T + Ayz^T + zy^TA^T + zz^T | y] $$ $$ E[\beta\beta^T | y] = E[Ayy^TA^T|y] + E[Ayz^T|y] + E[zy^TA^T|y] + E[zz^T|y]$$
we know that $y$ and $A$ are constant inside these conditional expectations, so they can be moved out of the expectation brackets, and we get:
$$ E[\beta\beta^T | y] = Ayy^TA^T + AyE[z^T|y] + E[z|y]y^TA^T + E[zz^T|y]$$
Since $z$ is independent of $y$ and has zero mean, $E[z^T \mid y] = E[z^T] = 0$, so $AyE[z^T \mid y] = 0$ and $E[z \mid y]y^TA^T = 0$, and we get: $$ E[\beta\beta^T | y] = Ayy^TA^T + E[zz^T|y] = Ayy^TA^T + (X^TX + I_d)^{-1}$$
Those are my own steps for calculating the covariance. My question is how I should continue to finally get $$ E[\beta\beta^T | y] = (X^TX + I_d)^{-1} $$
Yeah, I assumed my covariance wrong: I forgot that I have to subtract the mean in the covariance formula, which was always $0$ for the previous distributions. Since $E[\beta \mid y] = (X^TX + I_d)^{-1}X^Ty = Ay$ and $\beta = Ay + z$:
$$ Cov(\beta | y) = E[(\beta - E[\beta | y])(\beta - E[\beta | y])^T | y] = E[(Ay + z - Ay)(Ay + z - Ay)^T | y] = E[zz^T |y] = E[zz^T] = (X^TX + I_d)^{-1}$$
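(For completeness, a Monte Carlo check of the resolution above, not from the book: sampling $\beta$ and $w$ from the model, the residual $z = \beta - Ay$ should empirically have covariance $\approx (X^TX + I_d)^{-1}$ and be uncorrelated with $y$. Sizes, seed, sample count, and tolerance are all arbitrary choices.)

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, trials = 5, 2, 200_000
X = rng.standard_normal((n, d))
I_d = np.eye(d)
A = np.linalg.inv(X.T @ X + I_d) @ X.T

beta = rng.standard_normal((trials, d))   # beta ~ N(0, I_d), one sample per row
w = rng.standard_normal((trials, n))      # w ~ N(0, I_n)
y = beta @ X.T + w                        # y = X beta + w, row-wise
z = beta - y @ A.T                        # z = beta - A y, row-wise

emp_cov_z = np.cov(z, rowvar=False)       # empirical Cov(z)
emp_cross = z.T @ y / trials              # empirical E[z y^T] (z, y are mean zero)

print(np.allclose(emp_cov_z, np.linalg.inv(X.T @ X + I_d), atol=0.02))
print(np.allclose(emp_cross, 0, atol=0.02))
```

Both checks should print `True` up to Monte Carlo error at this sample size.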