
I know the encoder is the variational posterior $q_{\phi}(\mathbf{z} \mid \mathbf{x})$.

I also know that the decoder represents the likelihood: $p_{\theta}(\mathbf{x} \mid \mathbf{z})$.

My question is about the prior $p(\mathbf{z})$.

I know the ELBO can be written as:

$$\mathbb{E}_{q_{\phi}(\mathbf{z} \mid \mathbf{x})}\left[\log p_{\theta}(\mathbf{x} \mid \mathbf{z})\right] - D_{\mathrm{KL}}\left(q_{\phi}(\mathbf{z} \mid \mathbf{x}) \,\|\, p(\mathbf{z})\right) \leq \log p_{\theta}(\mathbf{x})$$

And for the VAE, the variational posterior is

$$q_{\phi}(\mathbf{z} \mid \mathbf{x}^{(i)}) = \mathcal{N}\left(\boldsymbol{\mu}^{(i)}, \boldsymbol{\sigma}^{2(i)} \mathbf{I}\right),$$

and prior is

$$p(\mathbf{z}) = \mathcal{N}(\mathbf{0}, \mathbf{I}).$$

So

$$D_{\mathrm{KL}}\left(q_{\phi}(\mathbf{z} \mid \mathbf{x}) \,\|\, p(\mathbf{z})\right) = -\frac{1}{2}\sum_{j=1}^{J}\left(1 + \log(\sigma_j^2) - \sigma_j^2 - \mu_j^2\right)$$

That's one way I know the prior plays a role: it helps determine part of the loss function.
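To make that concrete, here is a minimal sketch (my own illustration, not from any particular source) of how the two ELBO terms are typically computed, assuming PyTorch, binary pixel data, and encoder outputs `mu` and `log_var` where `log_var` $= \log \boldsymbol{\sigma}^2$:

```python
import torch
import torch.nn.functional as F

def negative_elbo(x, x_recon, mu, log_var):
    """Per-batch negative ELBO for Gaussian q(z|x) and standard-normal prior p(z).

    x, x_recon: (batch, D) original pixels and decoder outputs (Bernoulli means).
    mu, log_var: (batch, J) encoder outputs, with log_var = log(sigma^2).
    """
    # E_q[log p(x|z)], approximated with the single reparameterized sample that produced x_recon
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    # Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ) -- the term in the question
    kl = -0.5 * torch.sum(1 + log_var - log_var.exp() - mu.pow(2))
    return recon + kl

# Quick numerical check: if q(z|x) = N(0, I) exactly, the KL term is zero.
mu = torch.zeros(4, 2)
log_var = torch.zeros(4, 2)
print(-0.5 * torch.sum(1 + log_var - log_var.exp() - mu.pow(2)))  # tensor(0.)
```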

Is there any other role that the prior plays for the VAE?

1 Answer


The prior $p(z)$ is assumed as part of the problem formulation. A typical case is where $z$ is a vector of iid normal random variables. The ELBO involves a regularization term which encourages $q(z \, | \, x)$ to have a similar distribution to $p(z)$ (the way you've written it, that's the KL term). Thus $q(z \, | \, x)$ will end up having a similar shape to $p(z)$. For example, again assuming $z$ is a vector of iid normals, if you plot samples of $z$ drawn from $q(z \, | \, x)$ you will find it has a roughly spherical shape. If you scroll down to the [16] code block and look at the figure you'll see what I mean. The figure is plotting samples of $z$, colored according to what $x$ is (MNIST example). This is just some random figure I found, and I don't endorse this code, but the image is what you'd expect to see.
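To show roughly how a figure like that is produced, here is a sketch of my own (not the code from that notebook); `encoder`, `x_test`, and `y_test` are untrained stand-ins for a trained 2-D-latent encoder and the MNIST test set, so the sketch runs on its own:

```python
import torch
import matplotlib.pyplot as plt

# Stand-ins for illustration: in practice `encoder` is a trained network returning
# (mu, log_var), and x_test / y_test are MNIST test images and labels.
encoder = torch.nn.Linear(784, 4)                # first 2 outputs = mu, last 2 = log_var
x_test = torch.randn(1000, 784)
y_test = torch.randint(0, 10, (1000,))

with torch.no_grad():
    out = encoder(x_test)
    mu, log_var = out[:, :2], out[:, 2:]
    z = mu + (0.5 * log_var).exp() * torch.randn_like(mu)   # samples from q(z | x)

plt.scatter(z[:, 0], z[:, 1], c=y_test, cmap="tab10", s=4)
plt.xlabel("z1")
plt.ylabel("z2")
plt.title("Samples of z from q(z|x), colored by digit class")
plt.show()
```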

The way we end up with a distribution $p(x, z)$ is by using the prior. We sample $z$ according to $p(z)$; we've trained the decoder $p(x \, | \, z)$, and by definition $p(x, z) = p(x \, | \, z) p(z)$.
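In code, generation via the prior looks roughly like this (a sketch with an untrained stand-in network; a real VAE would use its trained decoder here):

```python
import torch

# Stand-in for a trained decoder p(x | z) mapping a 2-D latent code to 784 Bernoulli means
decoder = torch.nn.Sequential(
    torch.nn.Linear(2, 400), torch.nn.ReLU(),
    torch.nn.Linear(400, 784), torch.nn.Sigmoid(),
)

with torch.no_grad():
    z = torch.randn(16, 2)   # z ~ p(z) = N(0, I), i.e. sampled from the prior
    x_new = decoder(z)       # parameters of p(x | z); reshape to (16, 28, 28) to view as images
```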
