I'm not good at English, so I apologize in advance.
I have a question about "Pattern Recognition and Machine Learning": equation (3.63) in Section 3.3.3.
I don't understand how the second expression below is transformed into the third — could anyone explain?
$$ \text{cov}[y(x), y(x')] = \text{cov}[\phi(x)^T w,\, w^T\phi(x')] = \phi(x)^T S_N \phi(x') = \beta^{-1} k(x,x') $$
Let the prior $Q_0$ over the weights be given by the density $$ P(w\mid\alpha)=\mathcal{N}(w\mid 0,\alpha^{-1}I), $$ so that the posterior over the weights $Q_w$ given the training targets $\vec{t}$ has density $$ P(w\mid\vec{t}) = \mathcal{N}(w\mid m_N,S_N),\qquad m_N = \beta S_N\Phi^T\vec{t},\qquad S_N^{-1}= \text{cov}(w)^{-1}=\alpha I + \beta\Phi^T\Phi. $$

The predictive distribution $Q_p(x)$ (for input $x$) then has density $$ P(t\mid x,\vec{t},\alpha,\beta)=\int P(t\mid x,w,\beta)\, P(w\mid\vec{t},\alpha,\beta)\, dw=\mathcal{N}\!\left(t\mid m_N^T\phi(x),\,\beta^{-1}+\phi(x)^TS_N\phi(x)\right). $$

Thus the (noise-free) model prediction is $$ y(x) = w^T\phi(x), \qquad w \sim Q_w, \tag{0} $$ so the predictive mean is simply $$ y(x,m_N) = \mathbb{E}[y(x)] = m_N^T\phi(x). \tag{1} $$

Notice that the covariance of the weights under the posterior is $$ \text{cov}(w) = S_N, \tag{2} $$ and that the following identity holds for covariances of vector-valued random variables in general: $$ \text{cov}(w) + \mathbb{E}[w]\, \mathbb{E}[w]^T = \mathbb{E}[ww^T]. \tag{3} $$

One more note (eq. 3.62 in the book): $$ y(x,m_N)=\phi(x)^T m_N=\beta\,\phi(x)^T S_N\Phi^T \vec{t}=\sum_i k(x,x_i)t_i, $$ $$\therefore\; \beta^{-1}k(x,x_i) = \phi(x)^TS_N\phi(x_i).\tag{4} $$

Ok, so what is the covariance between our scalar predictions for two inputs? Using (0), $$ \text{cov}[y(x_1), y(x_2)] = \text{cov}[w^T\phi(x_1),\, w^T\phi(x_2)]. $$
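The equivalent-kernel relation (4) is easy to check numerically. Below is a minimal sketch (not from the book) with made-up hyperparameters $\alpha,\beta$, hypothetical Gaussian basis functions, and toy data; it verifies that the predictive mean $m_N^T\phi(x)$ equals the kernel-weighted sum $\sum_i k(x,x_i)t_i$ of eq. 3.62:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 2.0, 25.0              # illustrative precision hyperparameters
X = rng.uniform(-1, 1, size=20)      # toy 1-D inputs
t = np.sin(2 * np.pi * X) + rng.normal(0, beta ** -0.5, size=X.shape)

# Hypothetical Gaussian basis phi_j(x) = exp(-(x - mu_j)^2 / (2 s^2))
mus = np.linspace(-1, 1, 9)
s = 0.2

def phi(x):
    x = np.atleast_1d(x)
    return np.exp(-(x[:, None] - mus[None, :]) ** 2 / (2 * s ** 2))

Phi = phi(X)                                                  # design matrix, N x M
S_N = np.linalg.inv(alpha * np.eye(len(mus)) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ t                                  # posterior mean

# Equivalent kernel k(x, x') = beta * phi(x)^T S_N phi(x')    (eq. 3.62)
def k(x, xp):
    return beta * (phi(x) @ S_N @ phi(xp).T)

x_star = 0.3
mean_direct = (phi(x_star) @ m_N)[0]       # m_N^T phi(x*)
mean_kernel = (k(x_star, X) @ t)[0]        # sum_i k(x*, x_i) t_i
assert np.isclose(mean_direct, mean_kernel)
```

The two means agree exactly (up to floating-point rounding), since both are just $\beta\,\phi(x_*)^T S_N\Phi^T\vec{t}$ written in different orders.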
Then, using the identity for the covariance of scalar random variables $$\text{cov}[s_1,s_2]=\mathbb{E}[s_1s_2] - \mathbb{E}[s_1]\mathbb{E}[s_2],$$ we get \begin{align} \text{cov}[w^T\phi(x_1), w^T\phi(x_2)] &= \mathbb{E}[\phi(x_1)^T w w^T\phi(x_2)] - \phi(x_1)^Tm_Nm_N^T\phi(x_2) \\ &= \phi(x_1)^T\mathbb{E}[ w w^T]\phi(x_2) - \phi(x_1)^Tm_Nm_N^T\phi(x_2) \\ &= \phi(x_1)^T\left[ \text{cov}(w) + \mathbb{E}[w] \mathbb{E}[w]^T\right]\phi(x_2) - \phi(x_1)^Tm_Nm_N^T\phi(x_2) \\ &= \phi(x_1)^T S_N \phi(x_2) + \phi(x_1)^T m_Nm_N^T \phi(x_2) - \phi(x_1)^Tm_Nm_N^T\phi(x_2)\\ &= \phi(x_1)^T S_N \phi(x_2) \\ &= \beta^{-1}k(x_1,x_2), \end{align} using (1) in line 1, linearity of expectation in line 2, (3) in line 3, (2) in line 4, and (4) in the last line.
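The final identity can also be sanity-checked by Monte Carlo: sample $w$ from the posterior $\mathcal{N}(m_N, S_N)$ and compare the empirical covariance of $w^T\phi(x_1)$ and $w^T\phi(x_2)$ against the closed form $\phi(x_1)^T S_N \phi(x_2)$. A sketch under assumed toy settings (cubic polynomial basis, illustrative $\alpha,\beta$ and data):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta = 2.0, 25.0                       # illustrative hyperparameters
X = rng.uniform(-1, 1, 20)
t = np.sin(2 * np.pi * X) + rng.normal(0, beta ** -0.5, X.shape)

# Hypothetical polynomial basis phi(x) = (1, x, x^2, x^3)
def phi(x):
    return np.vander(np.atleast_1d(x), 4, increasing=True)

Phi = phi(X)
S_N = np.linalg.inv(alpha * np.eye(4) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ t

# Draw posterior weight samples and form the two scalar predictions
w = rng.multivariate_normal(m_N, S_N, size=200_000)
x1, x2 = -0.4, 0.7
y1 = w @ phi(x1).ravel()                      # samples of y(x1) = w^T phi(x1)
y2 = w @ phi(x2).ravel()                      # samples of y(x2) = w^T phi(x2)
mc_cov = np.cov(y1, y2)[0, 1]                 # empirical cov[y(x1), y(x2)]

closed_form = (phi(x1) @ S_N @ phi(x2).T)[0, 0]   # phi(x1)^T S_N phi(x2)
assert abs(mc_cov - closed_form) < 1e-3
```

Note that the covariance does not involve the additive observation noise $\beta^{-1}$, which only enters the predictive variance at a single point.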