I am learning about variational inference and the Gibbs sampler. I am deriving variational inference on my own, and in the process I need to compare it with the Gibbs sampler. I am reading the book "Machine Learning: A Probabilistic Perspective" by Kevin Murphy and found this sentence that I need help with:
"Since we are replacing the neighboring values by their mean value, the method is known as mean field. This is very similar to Gibbs sampling (Section 24.2), except instead of sending sampled values between neighboring nodes, we send mean values between nodes."
Could someone explain how Gibbs sampling sends sampled values between neighbors while VI sends mean values between nodes? I am looking for an in-depth explanation that spells out what corresponds to what and what this comparison means.
Let's call $q(\textbf x)$ the distribution that we're trying to sample from where $\textbf x$ (usually represented by $\theta$) is the parameter vector $(x_1,\dots,x_d)$.
Gibbs sampling works by drawing each coordinate from its full conditional distribution, using the most recent values passed on from the other coordinates, hence 'sending sampled values across neighboring nodes.' That is, initialize $\textbf x^{(0)}=(x_1^{(0)}, \dots, x_d^{(0)})$, then on each iteration $j$, generate each coordinate in turn, conditioning on the most recent values of the others: \begin{align}x_1^{(j)}&\sim q_1(x_1|x_2^{(j-1)},\dots,x_d^{(j-1)})\\ x_2^{(j)}&\sim q_2(x_2|x_1^{(j)}, x_3^{(j-1)},\dots,x_d^{(j-1)})\\ &\vdots\\ x_d^{(j)}&\sim q_d(x_d|x_1^{(j)},\dots,x_{d-1}^{(j)})\end{align}
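To make this concrete, here is a minimal Python sketch of that sweep on a toy target of my own choosing (not from the book): a bivariate standard normal with correlation $\rho$, where both full conditionals are known Gaussians, $x_1\mid x_2 \sim \mathcal N(\rho x_2,\, 1-\rho^2)$ and symmetrically for $x_2$:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                  # assumed correlation of the toy bivariate normal target
sd = np.sqrt(1 - rho**2)   # conditional std: x1 | x2 ~ N(rho*x2, 1 - rho^2)

n_iter = 20_000
x1, x2 = 0.0, 0.0          # initialize x^(0)
samples = np.empty((n_iter, 2))
for j in range(n_iter):
    # draw x1^(j) given the previous *sampled* value of x2
    x1 = rng.normal(rho * x2, sd)
    # draw x2^(j) given the just-sampled x1^(j)
    x2 = rng.normal(rho * x1, sd)
    samples[j] = x1, x2

# the chain's draws approximate the joint: means near 0, correlation near rho
print(samples.mean(axis=0), np.corrcoef(samples.T)[0, 1])
```

Note that at every step a freshly sampled value, not a summary statistic like a mean, is what gets passed into the next conditional.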
The mean field method instead assumes that the posterior distribution factorizes over disjoint blocks of the parameters:
$$q(\textbf x) = \prod _{i=1}^D q_i(\textbf x_i)$$ where each $\textbf x_i$ is a block of one or more coordinates, so $D\le d$.
The update rule for each factor on each iteration is:
$$\log q_j(\textbf x_j) = \mathbb E_{-q_j}[\log \tilde p(\textbf x)]+\text{const}$$
where, for example, if there are three parameters, $\mathbb E_{-q_2}[f(\textbf x)] = \sum_{x_1}\sum_{x_3}q_1(x_1)q_3(x_3)f(x_1,x_2,x_3)$.
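For instance, with three binary variables and an arbitrary $f$ (a toy setup of my own, just to show the mechanics of the sum), that expectation can be computed directly:

```python
# Toy illustration of E_{-q2}[f(x)] = sum_{x1} sum_{x3} q1(x1) q3(x3) f(x1,x2,x3)
q1 = [0.3, 0.7]   # assumed factor over x1 in {0, 1}
q3 = [0.6, 0.4]   # assumed factor over x3 in {0, 1}

def f(x1, x2, x3):
    return x1 + 2 * x2 + 3 * x3   # any function of all three variables

def expect_minus_q2(x2):
    # average f over x1 and x3 under q1 and q3; x2 stays a free argument
    return sum(q1[a] * q3[b] * f(a, x2, b) for a in (0, 1) for b in (0, 1))

print(expect_minus_q2(0), expect_minus_q2(1))
```

The result is still a function of $x_2$ (here $0.7 + 2x_2 + 1.2$), which is exactly why the update yields a distribution over $x_2$ rather than a number.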
Within one iteration, the factors are updated in sequence:
\begin{align} \log q_1(\textbf x_1) &= \mathbb E_{-q_1}[\log \tilde p(\textbf x)]+\text{const}\\ \log q_2(\textbf x_2) &= \mathbb E_{-q_2}[\log \tilde p(\textbf x)]+\text{const}\\ \vdots\\ \log q_D(\textbf x_D) &= \mathbb E_{-q_D}[\log \tilde p(\textbf x)]+\text{const} \end{align}
Since the update for $q_2$ uses the expected value (mean) under the just-updated $q_1$, rather than a sample drawn from it, each step passes on means instead of sampled values. This is the sense in which mean field 'sends mean values between nodes' where Gibbs sampling sends samples.
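As a minimal illustration, take the same kind of toy target (my own example, not from the book): a bivariate standard normal with correlation $\rho$. For a Gaussian target the optimal mean-field factors are themselves Gaussian, with the standard closed-form updates $m_1 \leftarrow \rho m_2$, $m_2 \leftarrow \rho m_1$ and fixed variance $1-\rho^2$; only the *means* of the other factors enter each update:

```python
rho = 0.8           # assumed correlation of the toy bivariate normal target
m1, m2 = 1.0, -1.0  # arbitrary initialization of the factor means

for _ in range(50):
    m1 = rho * m2   # update q1 using only the mean of q2 (no sampling)
    m2 = rho * m1   # update q2 using the just-updated mean of q1
s2 = 1 - rho**2     # each Gaussian factor's variance, fixed in closed form

print(m1, m2, s2)   # the means converge to the true marginal means (0, 0)
```

Contrast this with a Gibbs sweep on the same target, which would draw a random value from each conditional and pass the sample on; here a deterministic mean $m_i$ is passed instead, which is the comparison Murphy is making.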