Doubt in understanding GANs


I was going through the original GAN paper: Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014. Link: http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf

For proving the optimal discriminator $D$ (equation 2), they rewrite the objective function as equation 3:

$$V(G, D) = \int_x p_{\text{data}}(x) \log(D(x))\,dx + \int_z p_z(z) \log(1 - D(g(z)))\,dz = \int_x p_{\text{data}}(x) \log(D(x)) + p_g(x) \log(1 - D(x))\,dx$$

So, essentially, they have changed $p_z(z)$ to $p_g(x)$ and $g(z)$ to $x$. My question is: how can this be done?
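(For reference, here is a sketch of the substitution as the standard change of variables to the pushforward density, i.e. the law of the unconscious statistician. It assumes the density $p_g$ of $x = g(z)$ induced by $z \sim p_z$ actually exists:)

```latex
\int_z p_z(z)\,\log\bigl(1 - D(g(z))\bigr)\,dz
  \;=\; \mathbb{E}_{z \sim p_z}\bigl[\log(1 - D(g(z)))\bigr]
  \;=\; \mathbb{E}_{x \sim p_g}\bigl[\log(1 - D(x))\bigr]
  \;=\; \int_x p_g(x)\,\log\bigl(1 - D(x)\bigr)\,dx
```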

p.s.: Is this the correct place to ask such questions? Is there a dedicated place where I can ask questions about specific subtopics of ML?


The change of variables in the proof of Proposition 1 of Goodfellow et al.'s 2014 GAN paper is valid. However, one needs to pay particular attention to the dimensions of the latent variable $z$ and the data variable $x$ in the transformation $x = G(z)$. (Aside: we really should write $\hat{x}$ for the generator output variable rather than $x$.) Everything in the paper is written as if these were scalars, which is confusing.

It turns out that when $\dim(z) \geq \dim(x)$ everything is fine, but when $\dim(z) < \dim(x)$, which is the usual case in practical image synthesis, the PDF of the generator output, $p_g(x)$, is degenerate: it contains delta functions, and it is also non-unique. The change-of-variables formula still works, but (and this is the clincher) the next part of the proof of Proposition 1 does not hold. That step uses variational calculus, which requires the integrand to be continuously differentiable with respect to $x$ and $D(\cdot)$, and it clearly is not when delta functions are present. This means that equation (3) holds, but the optimal discriminator does not exist when $\dim(z) < \dim(x)$.

This assertion has recently been demonstrated in a paper by Google researchers [1], who used ODEs to implement GANs (rather than plain stochastic gradient descent). They obtained the expected convergence for an example with $\dim(x) = 2 < \dim(z) = 32$, but not for an example with $\dim(x) = 3072 > \dim(z) = 128$ (CIFAR-10).
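The degeneracy is easy to see numerically. Here is a minimal sketch (not from the paper), using a hypothetical linear generator $G: \mathbb{R}^1 \to \mathbb{R}^2$, so $\dim(z) = 1 < \dim(x) = 2$: every output lands exactly on a one-dimensional line, a set of Lebesgue measure zero in $\mathbb{R}^2$, so $p_g$ cannot be an ordinary density there.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear "generator" G: R^1 -> R^2, G(z) = (z, 2z + 1).
A = np.array([[1.0], [2.0]])
b = np.array([0.0, 1.0])

z = rng.standard_normal((10000, 1))  # latent samples, z ~ N(0, 1)
x = z @ A.T + b                      # generator outputs in R^2

# All outputs satisfy x2 = 2*x1 + 1 exactly, so the sample covariance
# is rank 1: p_g is supported on a line (degenerate in R^2).
cov = np.cov(x, rowvar=False)
print(np.linalg.matrix_rank(cov))    # 1, not 2
```

With a nonlinear $G$ the support is a curve rather than a line, but the conclusion is the same: the pushforward measure concentrates on a $\dim(z)$-dimensional set.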

A clear explanation with low-dimensional examples is given in section 2 of this paper: https://www.researchgate.net/publication/356815736_Convergence_and_Optimality_Analysis_of_Low-Dimensional_Generative_Adversarial_Networks_using_Error_Function_Integrals