Neural network - function estimation


My questions are about the paper: Semi-supervised Learning with Deep Generative Models (Kingma, D.P. et al, 2014).

Suppose I have a generative network with observed data $x$ and latent variables $z$, and we use $z$ to generate $x$.

Now suppose my prior distribution on the latent variables is a spherical Gaussian:

$p(z) = N(z \mid 0, I)$

And I want to approximate the posterior with $q_\phi (z \mid x) = N(z \mid \mu_\phi(x), {\rm{diag}}(\sigma_{\phi}^2 (x)))$.

I use a multi-layer perceptron (MLP) with parameters $\phi$ to output $\mu_{\phi}(x)$ and $\sigma_{\phi}(x)$.
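To make the setup concrete, here is a minimal NumPy sketch of such an encoder: one shared hidden layer feeding two output "heads", one for the mean and one for the log-variance of the diagonal Gaussian. The layer sizes, the tanh nonlinearity, and the random weights are illustrative assumptions on my part, not the paper's exact architecture; the point is only that $\phi$ is the collection of weights and biases, and $\mu_\phi(x)$ and $\sigma_\phi^2(x)$ are just the network's outputs for a given $x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 4-d input x, 8 hidden units, 2-d latent z.
x_dim, h_dim, z_dim = 4, 8, 2

# phi = all the weights and biases below (here random, untrained).
W1 = rng.normal(scale=0.1, size=(h_dim, x_dim)); b1 = np.zeros(h_dim)
W_mu = rng.normal(scale=0.1, size=(z_dim, h_dim)); b_mu = np.zeros(z_dim)
W_lv = rng.normal(scale=0.1, size=(z_dim, h_dim)); b_lv = np.zeros(z_dim)

def encoder(x):
    """Map an input x to (mu_phi(x), sigma_phi(x)^2)."""
    h = np.tanh(W1 @ x + b1)        # shared hidden layer
    mu = W_mu @ h + b_mu            # mean head: unconstrained real values
    log_var = W_lv @ h + b_lv       # log-variance head: exp() keeps variance > 0
    return mu, np.exp(log_var)

x = rng.normal(size=x_dim)
mu, var = encoder(x)
print(mu.shape, var.shape)          # one mean and one variance per latent dimension
```

Predicting the log-variance rather than the variance directly is a common trick: the head's output can be any real number, and exponentiating guarantees a positive variance.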

My questions are:

1) How do the parameters $\phi$ of the MLP determine the mean and variance?

2) How does that work mathematically? How do I get the approximate posterior mean and variance from this MLP?

3) Why do I need an MLP at all?