For questions related to variational auto-encoders (VAEs). The first VAE was proposed in "Auto-Encoding Variational Bayes" (2013) by Diederik P. Kingma and Max Welling. Several variants of the VAE exist, for example, the conditional VAE.
Questions tagged [variational-autoencoder]
115 questions
12
votes
1 answer
What are the fundamental differences between VAE and GAN for image generation?
Starting from my own understanding, and scoped to the purpose of image generation, I'm well aware of the major architectural differences:
A GAN's generator samples from a relatively low-dimensional random variable and produces an image. Then the…
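As a minimal sketch of the sampling step the excerpt describes (a hypothetical toy generator; architecture, sizes, and names are illustrative, not from the question):

    import torch
    import torch.nn as nn

    # Toy generator: maps a low-dimensional noise vector to a flat 28x28 image.
    latent_dim = 100
    generator = nn.Sequential(
        nn.Linear(latent_dim, 256), nn.ReLU(),
        nn.Linear(256, 28 * 28), nn.Tanh(),  # pixel values in [-1, 1]
    )

    z = torch.randn(16, latent_dim)                 # low-dimensional random variable
    fake_images = generator(z).view(16, 1, 28, 28)  # one image per noise sample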

Alexander Soare
- 1,319
- 2
- 11
- 26
9
votes
3 answers
Why is the variational auto-encoder's output blurred, while a GAN's output is crisp and has sharp edges?
I observed in several papers that the variational autoencoder's output is blurred, while a GAN's output is crisp and has sharp edges.
Can someone please give some intuition as to why that is the case? I thought about it a lot but couldn't find any explanation.

Trect
- 269
- 1
- 4
- 7
8
votes
3 answers
How does backprop work through the random sampling layer in a variational autoencoder?
Implementations of variational autoencoders that I've looked at all include a sampling layer as the last layer of the encoder block. The encoder learns to generate a mean and standard deviation for each input, and samples from the resulting distribution to get the input's…
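For reference, the standard answer is the reparameterization trick: the stochasticity is moved into an auxiliary noise variable, so the sample is a differentiable function of the encoder's outputs. A minimal sketch (tensor shapes are illustrative):

    import torch

    # Sampling z ~ N(mu, sigma^2) directly would block gradients at the sampling
    # node. Instead, draw eps ~ N(0, I) and compute z deterministically from the
    # encoder outputs, so gradients flow back into mu and log_var.
    mu = torch.zeros(8, 4, requires_grad=True)       # stand-in encoder outputs
    log_var = torch.zeros(8, 4, requires_grad=True)

    eps = torch.randn_like(mu)                       # the only stochastic node
    z = mu + torch.exp(0.5 * log_var) * eps          # differentiable in mu, log_var

    z.sum().backward()
    print(mu.grad is not None, log_var.grad is not None)  # True True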

Luke Wolcott
- 183
- 4
7
votes
1 answer
Why don't VAEs suffer from mode collapse?
Mode collapse is a common problem faced by GANs. I am curious why VAEs don't suffer from mode collapse.

Trect
- 269
- 1
- 4
- 7
7
votes
2 answers
How is this PyTorch expression equivalent to the KL divergence?
I found the following PyTorch code (from this link)
-0.5 * torch.sum(1 + sigma - mu.pow(2) - sigma.exp())
where mu is the mean parameter that comes out of the encoder and sigma is the variance parameter from the encoder. This expression is apparently…
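For reference, the expression is the closed-form KL divergence between the diagonal Gaussian N(mu, diag(sigma^2)) and N(0, I), where the tensor named sigma actually holds the log-variance log(sigma^2) (most implementations call it log_var). A quick numerical cross-check (values are illustrative):

    import torch
    from torch.distributions import Normal, kl_divergence

    mu = torch.randn(5, 3)
    log_var = torch.randn(5, 3)       # plays the role of `sigma` in the snippet

    # The snippet from the question, with the clearer name log_var:
    closed_form = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())

    # PyTorch's analytic KL for diagonal Gaussians agrees:
    q = Normal(mu, torch.exp(0.5 * log_var))
    p = Normal(torch.zeros_like(mu), torch.ones_like(mu))
    print(torch.allclose(closed_form, kl_divergence(q, p).sum()))  # True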

user8714896
- 717
- 1
- 4
- 21
6
votes
1 answer
How should we choose the dimensions of the encoding layer in auto-encoders?
How should we choose the dimensions of the encoding layer in auto-encoders?
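One common heuristic is to sweep the bottleneck size and pick the smallest dimension past which validation reconstruction error stops improving; a sketch of that sweep follows (the dataset, sizes, and training budget are illustrative stand-ins):

    import torch
    import torch.nn as nn

    x = torch.randn(512, 32)  # stand-in dataset with 32 features

    for k in [2, 4, 8, 16]:   # candidate bottleneck sizes
        model = nn.Sequential(nn.Linear(32, k), nn.ReLU(), nn.Linear(k, 32))
        opt = torch.optim.Adam(model.parameters(), lr=1e-2)
        for _ in range(200):
            loss = nn.functional.mse_loss(model(x), x)
            opt.zero_grad(); loss.backward(); opt.step()
        print(k, loss.item())  # look for the elbow where the error plateaus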

Neha soni
- 101
- 3
6
votes
1 answer
Why is the evidence equal to the KL divergence plus the loss?
Why is the equation $$\log p_{\theta}(x^i)=D_{KL}(q_{\phi}(z|x^i)\,\|\,p_{\theta}(z|x^i))+\mathcal{L}(\theta,\phi;x^i)$$ true for each data point $x^i$ (the full log-likelihood $\log p_{\theta}(x^1,\dots,x^N)$ being the sum of these per-point terms), where $z$ are latent variables?
I was reading the original variational autoencoder paper and I…
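For context, the identity (stated per data point, in the paper's notation) follows by taking the expectation of $\log p_{\theta}(x)$ under $q_{\phi}(z|x)$ and splitting the log-ratio:
$$\log p_{\theta}(x)=\mathbb{E}_{q_{\phi}(z|x)}\left[\log\frac{p_{\theta}(x,z)}{p_{\theta}(z|x)}\right]=\underbrace{\mathbb{E}_{q_{\phi}(z|x)}\left[\log\frac{p_{\theta}(x,z)}{q_{\phi}(z|x)}\right]}_{\mathcal{L}(\theta,\phi;x)}+\underbrace{\mathbb{E}_{q_{\phi}(z|x)}\left[\log\frac{q_{\phi}(z|x)}{p_{\theta}(z|x)}\right]}_{D_{KL}(q_{\phi}(z|x)\,\|\,p_{\theta}(z|x))}$$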

user8714896
- 717
- 1
- 4
- 21
5
votes
1 answer
What does the notation $\mathcal{N}(z; \mu, \sigma)$ stand for in statistics?
I know that the notation $\mathcal{N}(\mu, \sigma)$ stands for a normal distribution.
But I'm reading the book "An Introduction to Variational Autoencoders" and in it, there is this notation:
$$\mathcal{N}(z; 0, I)$$
What does it mean?
picture of…
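The semicolon separates the point of evaluation from the parameters: $\mathcal{N}(z; \mu, \Sigma)$ denotes the density of a normal distribution with mean $\mu$ and covariance $\Sigma$, evaluated at $z$, so $\mathcal{N}(z; 0, I)$ is the standard normal density at $z$. A small numerical illustration (not from the book):

    import torch
    from torch.distributions import MultivariateNormal

    # N(z; 0, I): the density of a standard multivariate normal, evaluated at z.
    d = 3
    z = torch.randn(d)
    standard_normal = MultivariateNormal(torch.zeros(d), torch.eye(d))
    print(standard_normal.log_prob(z).exp())  # the value of N(z; 0, I)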

Peyman
- 534
- 3
- 10
5
votes
1 answer
Concrete example of latent variables and observables plugged into the Bayes' rule
In the context of the variational auto-encoder, can someone give me a concrete example of the application of Bayes' rule
$$p_{\theta}(z|x)=\frac{p_{\theta}(x|z)p(z)}{p(x)}$$
for a given latent variable and observable?
I understand with VAE's…
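A toy model where every term in the rule is tractable (unlike a real VAE, whose decoder makes $p(x)$ intractable; the numbers below are illustrative): take prior $z \sim \mathcal{N}(0, 1)$ and likelihood $x|z \sim \mathcal{N}(az, s^2)$, so the posterior is $p(z|x) = \mathcal{N}\left(\frac{ax}{a^2+s^2}, \frac{s^2}{a^2+s^2}\right)$, which a grid-based application of Bayes' rule reproduces:

    import torch
    from torch.distributions import Normal

    a, s, x = 2.0, 0.5, 1.3                          # illustrative values

    zs = torch.linspace(-4, 4, 4001)                 # grid over the latent z
    prior = Normal(0.0, 1.0).log_prob(zs).exp()      # p(z)
    likelihood = Normal(a * zs, s).log_prob(torch.tensor(x)).exp()  # p(x|z)
    posterior = prior * likelihood
    posterior /= torch.trapezoid(posterior, zs)      # divide by p(x) on the grid

    post_mean = torch.trapezoid(zs * posterior, zs)
    print(post_mean.item(), a * x / (a**2 + s**2))   # both are about 0.612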

user8714896
- 717
- 1
- 4
- 21
5
votes
1 answer
Does MMD-VAE solve the problem of blurred images of vanilla VAEs?
I understand that with vanilla VAEs, there are a few reasons justifying the production of blurred out images. The InfoVAE paper describes the case when the decoder is flexible enough to ignore the latent attributes and generate an averaged out image…
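For reference, the InfoVAE/MMD-VAE approach replaces the per-sample KL term with a maximum mean discrepancy between encoder samples and prior samples; a minimal sketch with an RBF kernel (bandwidth and shapes are illustrative choices):

    import torch

    def rbf_kernel(a, b, bandwidth=1.0):
        # Gaussian (RBF) kernel matrix between two sample sets.
        sq_dists = torch.cdist(a, b).pow(2)
        return torch.exp(-sq_dists / (2 * bandwidth ** 2))

    def mmd(z_q, z_p, bandwidth=1.0):
        # Biased (V-statistic) estimate of MMD^2 between q(z) and p(z).
        return (rbf_kernel(z_q, z_q, bandwidth).mean()
                + rbf_kernel(z_p, z_p, bandwidth).mean()
                - 2 * rbf_kernel(z_q, z_p, bandwidth).mean())

    z_from_encoder = torch.randn(256, 8) * 1.5 + 0.3  # stand-in encoder samples
    z_from_prior = torch.randn(256, 8)                # samples from N(0, I)
    print(mmd(z_from_encoder, z_from_prior))          # > 0; shrinks as q matches p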

Ananda
- 148
- 9
4
votes
2 answers
How to generate new data given a trained VAE - sample from the learned latent space or from multivariate Gaussian?
To generate a synthetic dataset using a trained VAE, there is confusion between two approaches:
Use learned latent space: z = mu + (eps * log_var) to generate (theoretically, infinite amounts of) data. Here, we are learning mu and log_var vectors…
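For genuinely new data, the usual recipe is to decode draws from the prior $\mathcal{N}(0, I)$; the encoder's mu and log_var are only needed to reconstruct or perturb specific inputs (and note the reparameterization itself is z = mu + eps * exp(0.5 * log_var), not eps * log_var). A minimal sketch with a hypothetical decoder:

    import torch
    import torch.nn as nn

    # Hypothetical decoder; architecture and sizes are illustrative.
    latent_dim = 16
    decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                            nn.Linear(128, 784), nn.Sigmoid())

    z = torch.randn(32, latent_dim)   # draws from the prior, no encoder involved
    samples = decoder(z)              # 32 synthetic examples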

Arun
- 225
- 1
- 8
4
votes
1 answer
Why would a VAE train much better with batch sizes closer to 1 than with batch sizes of 100+?
I've been training a VAE to reconstruct human names. When I train it with a batch size of 100+, after about 5 hours of training it tends to just output the same thing regardless of the input, and I'm using teacher forcing as well. When I use a lower…

user8714896
- 717
- 1
- 4
- 21
4
votes
1 answer
What is the impact of scaling the KL divergence and reconstruction loss in the VAE objective function?
Variational autoencoders have two components in their loss function. The first component is the reconstruction loss, which, for image data, is the pixel-wise difference between the input image and the output image. The second component is the…
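A minimal sketch of how the two components are usually combined, with a weight on the KL term (the beta-VAE formulation; the function and values are illustrative):

    import torch
    import torch.nn.functional as F

    def vae_loss(x_hat, x, mu, log_var, beta=1.0):
        # Reconstruction term plus beta-weighted KL. beta = 1 is the plain ELBO;
        # beta > 1 pushes q(z|x) toward the prior (often blurrier outputs);
        # beta << 1 favors reconstruction over a well-regularized latent space.
        recon = F.binary_cross_entropy(x_hat, x, reduction='sum')
        kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
        return recon + beta * kl

    x, x_hat = torch.rand(8, 784), torch.rand(8, 784)     # stand-in batch
    mu, log_var = torch.randn(8, 16), torch.randn(8, 16)  # stand-in encoder outputs
    print(vae_loss(x_hat, x, mu, log_var, beta=4.0))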

rich
- 151
- 6
4
votes
1 answer
Why does the variational auto-encoder use the reconstruction loss?
A VAE is trained to minimize the following two losses:
the KL divergence between the inferred latent distribution and a Gaussian prior, and
the reconstruction loss.
I understand that the first one regularizes the VAE to give a structured latent space. But why and how does the…
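For reference, with a Bernoulli decoder the reconstruction loss is exactly the negative log-likelihood $-\log p_{\theta}(x|z)$, i.e. summed binary cross-entropy; without it, the KL term alone would collapse every $q(z|x)$ onto the prior and the latent code would carry no information about the input. A small check (tensors are illustrative stand-ins):

    import torch
    from torch.distributions import Bernoulli

    x = (torch.rand(4, 784) > 0.5).float()         # stand-in binary input
    x_hat = torch.rand(4, 784) * 0.98 + 0.01       # stand-in decoder probabilities

    nll = -Bernoulli(probs=x_hat).log_prob(x).sum()
    bce = torch.nn.functional.binary_cross_entropy(x_hat, x, reduction='sum')
    print(torch.allclose(nll, bce))                # True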

Jun
- 89
- 5
4
votes
0 answers
Why can't a VAE do sequence-to-sequence name generation?
I'm working on research in this area: my supervisor wants to do canonicalization of name data using VAEs. I don't think it's possible, but I don't know how to show that explicitly in mathematical terms. I just know empirically that VAEs…

user8714896
- 717
- 1
- 4
- 21