Questions tagged [evidence-lower-bound]

For questions about the Evidence Lower BOund (ELBO) objective function, which is typically optimized in the context of variational auto-encoders or variational Bayesian neural networks.

19 questions
6
votes
1 answer

Why is the evidence equal to the KL divergence plus the loss?

Why is the equation $$\log p_{\theta}(x^1,...,x^N)=D_{KL}(q_{\theta}(z|x^i)||p_{\phi}(z|x^i))+\mathbb{L}(\phi,\theta;x^i)$$ true, where $x^i$ are data points and $z$ are latent variables? I was reading the original variational autoencoder paper and I…
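For reference, the identity follows by writing the evidence for a single data point as an expectation under any $q(z|x^i)$ and splitting the log-ratio (parameter subscripts omitted):
$$\begin{aligned}
\log p(x^i) &= \mathbb{E}_{q(z|x^i)}\!\left[\log \frac{p(x^i,z)}{p(z|x^i)}\right] \\
&= \mathbb{E}_{q(z|x^i)}\!\left[\log \frac{q(z|x^i)}{p(z|x^i)}\right] + \mathbb{E}_{q(z|x^i)}\!\left[\log \frac{p(x^i,z)}{q(z|x^i)}\right] \\
&= D_{KL}\big(q(z|x^i)\,\|\,p(z|x^i)\big) + \mathbb{L}(\phi,\theta;x^i).
\end{aligned}$$
The second term is the ELBO $\mathbb{L}$; since the KL term is non-negative, $\mathbb{L}$ lower-bounds the log-evidence.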
4
votes
1 answer

Why does the variational auto-encoder use the reconstruction loss?

A VAE is trained to reduce the following two losses: (1) the KL divergence between the inferred latent distribution and a Gaussian, and (2) the reconstruction loss. I understand that the first one regularizes the VAE to get a structured latent space. But why and how does the…
4
votes
2 answers

What's going on in the equation of the variational lower bound?

I don't really understand what this equation is saying or what the purpose of the ELBO is. How does it help us find the true posterior distribution?
3
votes
1 answer

What does the approximate posterior on latent variables, $q_\phi(z|x)$, tend to when optimising VAEs?

The ELBO objective is described as follows: $$\mathrm{ELBO}(\phi,\theta) = \mathbb{E}_{q_\phi(z|x)}[\log p_\theta (x|z)] - \mathrm{KL}[q_\phi (z|x)\,\|\,p(z)].$$ This form of the ELBO includes a regularisation term, in the form of the KL divergence, which drives $q_\phi(z|x)…
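For fixed $\theta$, the following identity clarifies what $q_\phi$ tends to:
$$\log p_\theta(x) = \mathrm{ELBO}(\phi,\theta) + \mathrm{KL}\big[q_\phi(z|x)\,\|\,p_\theta(z|x)\big].$$
The left-hand side does not depend on $\phi$, so maximising the ELBO over $\phi$ is equivalent to minimising $\mathrm{KL}[q_\phi(z|x)\,\|\,p_\theta(z|x)]$, which drives the approximate posterior towards the true posterior $p_\theta(z|x)$.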
2
votes
1 answer

Clarification on the training objective of denoising diffusion models

I'm reading the Denoising Diffusion Probabilistic Models paper (Ho et al., 2020), and I am puzzled about the training objective. I understood (I think) the trick regarding the reparametrization of the mean in terms of the noise: $$\mu_\theta(x_t,…
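For reference, the reparametrisation referred to (Ho et al., 2020) writes the reverse-process mean in terms of the predicted noise $\epsilon_\theta$:
$$\mu_\theta(x_t,t)=\frac{1}{\sqrt{\alpha_t}}\left(x_t-\frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(x_t,t)\right),$$
after which each term of the variational bound reduces, up to a weighting factor, to a mean-squared error between the true and the predicted noise.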
2
votes
2 answers

In variational autoencoders, why do people use MSE for the loss?

In VAEs, we try to maximize the ELBO $= \mathbb{E}_q [\log p(x|z)] - D_{KL}(q(z \mid x) \,\|\, p(z))$, but I see that many implement the first term as the MSE of the image and its reconstruction. Here's a paper (section 5) that seems to do that: Don't…
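The usual connection (writing $\hat{x}(z)$ for the decoder output, an illustrative notation): if the decoder is taken to be Gaussian with fixed variance, $p(x|z)=\mathcal{N}\!\big(x;\hat{x}(z),\sigma^{2}I\big)$, then
$$-\log p(x|z)=\frac{1}{2\sigma^{2}}\,\lVert x-\hat{x}(z)\rVert^{2}+\text{const},$$
so maximising $\mathbb{E}_q[\log p(x|z)]$ is, up to scale and an additive constant, the same as minimising the MSE between $x$ and its reconstruction.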
2
votes
1 answer

How does the implementation of the VAE's objective function equate to ELBO?

For a lot of the VAE implementations I've seen in code, it's not really obvious to me how the objective equates to the ELBO. $$L(X)=H(Q)-H(Q:P(X,Z))=\sum_Z Q(Z)\log P(Z,X)-\sum_Z Q(Z)\log Q(Z)$$ The above is the definition of the ELBO, where $X$ is some input, $Z$ is a latent…
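Splitting $P(Z,X)=P(X\mid Z)P(Z)$ turns that definition into the form most implementations use:
$$L(X)=\sum_Z Q(Z)\log\frac{P(Z,X)}{Q(Z)}=\mathbb{E}_{Q}\big[\log P(X\mid Z)\big]-\mathrm{KL}\big[Q(Z)\,\|\,P(Z)\big],$$
i.e. a reconstruction term plus a (negated) KL term, with $Q$ the encoder's amortised distribution.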
2
votes
1 answer

In this VAE formula, why do $p$ and $q$ have the same parameters?

In $$\log p_{\theta}(x^1,...,x^N)=D_{KL}(q_{\theta}(z|x^i)||p_{\phi}(z|x^i))+\mathbb{L}(\phi,\theta;x^i),$$ why do $p(x^1,...,x^N)$ and $q(z|x^i)$ have the same parameter $\theta$? Given that $p$ is just the probability of the observed data and…
2
votes
0 answers

Why does the ELBO come to a steady state while the latent space shrinks?

I'm trying to train a VAE using a graph dataset. However, my latent space shrinks epoch by epoch. Meanwhile, my ELBO plot comes to a steady state after a few epochs. I tried to play around with the parameters and realized that, by increasing the batch size…
1
vote
1 answer

If we know the joint distribution, can we simply derive the evidence from it?

I'm struggling to understand one specific part of the formalism of the free energy principle. My understanding is that the free energy principle can be derived from considering statistical dynamics of a system that is coupled with its environment in…
1
vote
0 answers

Is a VAE the same as the E-step of the EM algorithm?

EM (Expectation Maximization). Target: maximize $p_\theta(x)$. Since $p_\theta(x)=\frac{p_\theta(x, z)}{p_\theta(z \mid x)}$, take the log on both sides: $\log p_\theta(x)=\log p_\theta(x, z)-\log p_\theta(z \mid x)$. Introduce a distribution $q_\phi(z)$: $…
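For comparison, both EM and the VAE start from the same decomposition:
$$\log p_\theta(x)=\underbrace{\mathbb{E}_{q_\phi(z)}\!\left[\log\frac{p_\theta(x,z)}{q_\phi(z)}\right]}_{\text{ELBO}}+\mathrm{KL}\big[q_\phi(z)\,\|\,p_\theta(z\mid x)\big].$$
The E-step of EM makes the bound tight by setting $q(z)=p_\theta(z\mid x)$ exactly, whereas a VAE restricts $q_\phi$ to an amortised parametric family and can only approximate that choice.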
1
vote
0 answers

Variational inference but with a weighted log-likelihood

I would like to know whether it is correct to substitute a weighted sum of the log-likelihood, $$\sum_i \mathbb{E}_{q_{\theta}(w)}[w_i \ln p(y_i|f^{w}(x_i))],$$ in place of the traditional sum in the ELBO formula. My problem is that my dataset comes with the…
1
vote
1 answer

How is the variational lower bound for hard attention derived in Show, Attend and Tell?

How is the jump from line 1 to line 2 done in equation 10 of Show, Attend and Tell? While we're at it, another thing that might be muddying the waters for me is that I'm not clear on what the sum is over. I know that $s$ is indexed as $s_{t,i}$,…
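For reference, the jump in question is typically an application of Jensen's inequality to the marginal over attention locations $s$:
$$\log p(y\mid a)=\log\sum_{s}p(s\mid a)\,p(y\mid s,a)\;\ge\;\sum_{s}p(s\mid a)\,\log p(y\mid s,a).$$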
0
votes
1 answer

Confusion over taking gradients in Variational Autoencoders (VAE)

I am confused as to when to hold certain parameters constant in a VAE. I will explain with a concrete example. We can write $\operatorname{ELBO}(\phi, \theta) = \mathbb{E}_{q_{\phi}(z)}\left[\log \left(p_{\theta}(x| z)\right)\right] -…
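A minimal sketch (toy code, not taken from the question; the module names and sizes are illustrative assumptions) of the reparameterised ELBO, showing that the reconstruction term carries gradients to both the decoder parameters $\theta$ and the encoder parameters $\phi$, while the analytic KL term depends on $\phi$ only:

```python
# Toy sketch: which parameters receive gradients in a reparameterised VAE ELBO.
# All names (enc, dec, latent_dim, ...) are illustrative, not from the question.
import torch
import torch.nn as nn

latent_dim, data_dim = 2, 5
enc = nn.Linear(data_dim, 2 * latent_dim)   # encoder (phi): outputs mean and log-variance
dec = nn.Linear(latent_dim, data_dim)       # decoder (theta): parameterises p_theta(x|z)

x = torch.randn(8, data_dim)                # a toy batch

mu, logvar = enc(x).chunk(2, dim=-1)        # q_phi(z|x) = N(mu, diag(exp(logvar)))
eps = torch.randn_like(mu)                  # noise sampled outside the graph, so gradients
z = mu + eps * torch.exp(0.5 * logvar)      # flow to mu/logvar (phi), not to eps

recon = dec(z)
recon_term = -((recon - x) ** 2).sum(-1).mean()   # E_q[log p_theta(x|z)] up to constants (Gaussian decoder)
kl_term = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(-1).mean()  # KL[q_phi(z|x) || N(0, I)]
elbo = recon_term - kl_term

(-elbo).backward()                          # maximise ELBO = minimise -ELBO
print(enc.weight.grad.norm(), dec.weight.grad.norm())  # both phi and theta receive gradients
```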
0
votes
1 answer

Why is the variational lower bound easier to compute than the original marginal distribution?

Why is the ELBO of $p(x)=\int p(x|z)p(z)\,\mathrm{d}z$ easier to compute/estimate than the expression itself? Can we compute this quantity itself through sampling in the same way? I understand that aggregating over data means taking the log before…
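One way to see the difference: the ELBO is an expectation of a log, so a Monte Carlo average of log-terms estimates it without bias, whereas $\log p(x)$ is the log of an expectation, so the naive log-of-an-average estimator is biased (by Jensen's inequality) and can need many samples. A toy sketch, where the model and all names are illustrative assumptions, not from the question:

```python
# Toy comparison: Monte Carlo estimate of the ELBO vs a naive estimate of log p(x).
# Model (illustrative): z ~ N(0, 1), x | z ~ N(z, 1), so p(x) = N(x; 0, 2) exactly.
import numpy as np

rng = np.random.default_rng(0)
x = 1.5
log_px_exact = -0.5 * (np.log(2 * np.pi * 2.0) + x ** 2 / 2.0)

def log_normal(v, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (v - mean) ** 2 / var)

S = 100
# Naive: log p(x) ~ log( (1/S) sum_s p(x|z_s) ), z_s ~ p(z); log of an average (biased).
z_prior = rng.standard_normal(S)
naive = np.log(np.mean(np.exp(log_normal(x, z_prior, 1.0))))

# ELBO: (1/S) sum_s [log p(x|z_s) + log p(z_s) - log q(z_s|x)], z_s ~ q; average of logs.
q_mean, q_var = x / 2.0, 0.6                 # an assumed approximate posterior q(z|x)
z_q = q_mean + np.sqrt(q_var) * rng.standard_normal(S)
elbo = np.mean(log_normal(x, z_q, 1.0) + log_normal(z_q, 0.0, 1.0) - log_normal(z_q, q_mean, q_var))

print(f"exact log p(x): {log_px_exact:.3f}  naive MC: {naive:.3f}  ELBO MC: {elbo:.3f}")
```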