I am generating $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \dots, \mathbf{x}^{(n)}$ using Gibbs sampling.
So I want $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \dots, \mathbf{x}^{(n)} \sim$ some distribution $P$, and these samples should be independent, or nearly independent as measured by common statistics.
When using the Gibbs sampling algorithm, we start from an arbitrary state (or one sampled from an arbitrary initial distribution $P^{(1)}$). Following the algorithm, the chain "mixes" after a long while, so that its distribution becomes close to $P$.
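To make the setup concrete, here is a minimal Gibbs sampler sketch for a bivariate normal target with correlation `rho` (my own toy example, not from the question): each step resamples one coordinate from its full conditional, which is a one-dimensional normal.

```python
import random

def gibbs_bivariate_normal(n, rho=0.8, burn_in=1000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each full conditional is a 1-D normal:
        x | y ~ N(rho * y, 1 - rho^2)
        y | x ~ N(rho * x, 1 - rho^2)
    """
    rng = random.Random(seed)
    sd = (1 - rho ** 2) ** 0.5
    x, y = 0.0, 0.0                  # arbitrary starting state
    samples = []
    for t in range(burn_in + n):
        x = rng.gauss(rho * y, sd)   # sample x from P(x | y)
        y = rng.gauss(rho * x, sd)   # sample y from P(y | x)
        if t >= burn_in:             # discard burn-in, then keep every draw
            samples.append((x, y))
    return samples
```

Note that after the burn-in period every draw is kept, without restarting the chain; whether that is legitimate is exactly the question below.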
However, after the chain has mixed, we can keep sampling from it continuously. I just wonder why. Why needn't we wait until it re-mixes, or restart from a new random initial distribution?
For example, let's say after the chain is mixed, we sample $\mathbf{x}^{(1)}$. Then we sample $\mathbf{x}^{(2)}$. Certainly, $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$ are not independent. But we can still use the second sample.
So, why is this correct? I.e., why is an estimate computed from these correlated samples unbiased?
You are right: samples drawn via MCMC are always correlated, due to the Markov property of the chain. Usually we use the autocorrelation function to measure this kind of correlation. As the lag between two samples grows, the autocorrelation decreases and converges to 0; a better sampling method has a faster decay rate. So once the chain is adequately mixed, ergodicity of the Markov chain guarantees that you can keep sampling from the chain while each sample's marginal distribution remains the target distribution, and the ergodic theorem guarantees that averages over these correlated samples still converge to the correct expectations.
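This decay, and the fact that the ergodic average is still correct, can be checked numerically. The sketch below (my own illustration, reusing the bivariate-normal Gibbs chain with correlation `rho = 0.8` as an assumed toy target) estimates the sample autocorrelation of the $x$-coordinate at several lags, and the running mean, which should converge to the true mean 0 despite the correlation.

```python
import random

def autocorr(xs, lag):
    """Sample autocorrelation of the sequence xs at the given lag."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((v - m) ** 2 for v in xs) / n
    cov = sum((xs[t] - m) * (xs[t + lag] - m) for t in range(n - lag)) / n
    return cov / var

def gibbs_x_chain(n, rho=0.8, seed=1):
    """x-coordinate of a Gibbs chain targeting a bivariate normal (corr rho)."""
    rng = random.Random(seed)
    sd = (1 - rho ** 2) ** 0.5
    x = y = 0.0
    xs = []
    for _ in range(n):
        x = rng.gauss(rho * y, sd)   # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)   # y | x ~ N(rho * x, 1 - rho^2)
        xs.append(x)
    return xs

xs = gibbs_x_chain(50000)
# For this chain the autocorrelation decays geometrically
# (roughly like rho**(2*lag)), so it is near 0 by lag 20,
# yet the ergodic average converges to the true mean 0.
for lag in (1, 5, 20):
    print(lag, autocorr(xs, lag))
print(sum(xs) / len(xs))
```

Consecutive samples are clearly dependent (lag-1 autocorrelation well above 0), but thinning or restarting the chain is not needed for the average to be correct; it would only reduce the effective sample size per unit of computation.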