Where is the noise term in a generative model?

It seems that for a generative model, we always assume the observed data points are generated by some unknown distribution, rather than by noise. Where do we encode the noise in a generative model? Do we encode it at all?

> It seems that for a generative model, we always assume the observed data points are generated by some unknown distribution, rather than by noise.

Well, the "unknown distribution" is presumably a probability distribution with some entropy (i.e., randomness). That by itself means there is stochasticity in the sampling, and hence in the generated values as well.

Take the simplest example: a GMM with a single component, i.e., fitting one Gaussian to a dataset $X \in \mathbb{R}^{m \times n}$ of $m$ data points. Given $X$, we estimate the mean $\mu$ and covariance $\Sigma$, and the resulting generative model is simply
$$ x \sim \mathcal{N}(\mu, \Sigma). $$
Notice that a random sample $x$ is almost surely not one of the points of $X$: there is randomness in the samples. This presumably reflects the inherent randomness in the data itself (though of course the model is wrong to some extent, except in very special circumstances).
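
As a concrete illustration, here is a minimal numpy sketch of this one-component case; the dataset is synthetic and the dimensions are arbitrary, just stand-ins for real data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset: m = 500 points in n = 2 dimensions (a stand-in for real data).
X = rng.normal(loc=[1.0, -2.0], scale=[0.5, 1.5], size=(500, 2))

# "Fit" the one-component GMM: maximum-likelihood estimates of mu and Sigma.
mu = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)

# Generate a new sample: almost surely not one of the points of X.
x = rng.multivariate_normal(mu, Sigma)
print(x)
```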

Also, for this special case, notice that we can rewrite any generated sample as
$$ x = \mu + \Sigma^{1/2} z, \qquad z \sim \mathcal{N}(0, I_n), $$
where I've assumed $\Sigma$ is positive definite, so that a matrix square root $\Sigma^{1/2}$ (e.g., a Cholesky factor) exists. Written this way, there is literally a noise term $z$ being transformed and added to the inferred mean. In other words, the noise is encoded in the parameterized distribution itself.
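
The same sampling step with the noise term made explicit, as a sketch (the $\mu$ and $\Sigma$ values here are made up; any factor $B$ with $BB^\top = \Sigma$ works, and Cholesky is a convenient choice):

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, -2.0])
Sigma = np.array([[0.25, 0.1],
                  [0.1,  2.25]])  # assumed positive definite

# Matrix square root via Cholesky: L @ L.T == Sigma.
L = np.linalg.cholesky(Sigma)

# x = mu + sqrt(Sigma) z with z ~ N(0, I): the noise term is explicit,
# and x is distributed as N(mu, Sigma).
z = rng.standard_normal(2)
x = mu + L @ z
print(x)
```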

A slightly more complicated example is a VAE. A simple but common form has a latent variable $z$ with a prior $p(z)$, a stochastic encoder distribution $q_\phi(z|x)$, and a stochastic decoder distribution $p_\theta(x|z)$; we learn $\phi$ and $\theta$. Each time the network is run, we sample from these conditional distributions when encoding and/or decoding, which injects noise into the generative model. (Not to mention the additional stochasticity from sampling the prior $p(z)$ when generating.)
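
To make the noise injection points visible, here is a minimal numpy sketch of the sampling paths in such a VAE. The linear maps standing in for the trained encoder and decoder networks are hypothetical (random, untrained), and both conditional distributions are taken as unit-variance Gaussians purely for simplicity:

```python
import numpy as np

rng = np.random.default_rng(0)
d_x, d_z = 4, 2  # data and latent dimensions (arbitrary)

# Hypothetical "trained" parameters: random linear maps stand in for the
# encoder (phi) and decoder (theta) networks.
W_enc = rng.normal(size=(d_z, d_x))
W_dec = rng.normal(size=(d_x, d_z))

def encode(x):
    # Sample q_phi(z|x): a Gaussian whose mean depends on x (unit variance
    # here for simplicity). Sampling from it injects noise.
    mu_z = W_enc @ x
    return mu_z + rng.standard_normal(d_z)

def decode(z):
    # Sample p_theta(x|z): again Gaussian; sampling injects noise a second time.
    mu_x = W_dec @ z
    return mu_x + rng.standard_normal(d_x)

# Generation: sample the prior p(z) = N(0, I), then the decoder.
z = rng.standard_normal(d_z)   # noise source 1: the prior
x_gen = decode(z)              # noise source 2: the decoder

# Reconstruction: encode an input, then decode it.
x_in = rng.standard_normal(d_x)
x_rec = decode(encode(x_in))   # noise enters in both the encoder and decoder samples
```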