For questions related to the task of image generation, which can be done, for example, with variational auto-encoders (VAEs) or generative adversarial networks (GANs).
Questions tagged [image-generation]
81 questions
12 votes, 1 answer
What are the fundamental differences between VAE and GAN for image generation?
Starting from my own understanding, and scoped to the purpose of image generation, I'm well aware of the major architectural differences:
A GAN's generator samples from a relatively low dimensional random variable and produces an image. Then the…

Alexander Soare
10 votes, 2 answers
Using AI to extend an image pattern
I have created some nice patterns using the Midjourney tool. I'd like to find a way to extend these patterns, and I was thinking about an AI tool that takes one of these patterns and extends it in all directions surrounding the original…

Nicola Lepetit
7 votes, 1 answer
How much training data is required for a GAN?
I'm beginning to study and implement GANs to generate more datasets. I plan to experiment with state-of-the-art GAN models as described here: https://paperswithcode.com/sota/image-generation-on-cifar-10.
The problem is I don't have a big…

gameon67
6 votes, 1 answer
How does AI 'see' the images it generates, and from what perspective?
I've been using AI image generation for a while now, and I've noticed that AI doesn't seem to see the image as a whole at all, sometimes generating an image with parts of fingers floating near objects that are supposed to be held, VERY warped…

ben svenssohn
6 votes, 0 answers
How can an Artificial Intelligence system be ethically trained to generate art?
A lot of popular AI image-generation systems have been released recently, with systems such as Midjourney and DALL-E catching attention for how well put-together many of the automatically generated images are.
However, there has been a lot of…

Mithical
6 votes, 2 answers
What is the exact role of model $p_\theta$ in diffusion models for the reverse process?
I'm reading this interesting blog post explaining diffusion probabilistic models and trying to understand the following.
In order to compute the reverse process, we need to consider the posterior distribution $q(\textbf{x}_{t-1} | \textbf{x}_t)$…

James Arten
5 votes, 1 answer
Does MMD-VAE solve the problem of blurred images of vanilla VAEs?
I understand that with vanilla VAEs, there are a few reasons justifying the production of blurred out images. The InfoVAE paper describes the case when the decoder is flexible enough to ignore the latent attributes and generate an averaged out image…

Ananda
5 votes, 1 answer
Context-based gap-fill face posture-mapper GAN
These images are handmade, not auto-generated like they will be in production. Apologies for inaccuracies in the graph overlay.
I am trying to build an AI like that displayed in the diagram: when given a training set of images with their…

Geza Kerecsenyi
4 votes, 1 answer
What kind of algorithm is used by StackGAN to generate realistic images from text?
What kind of algorithm is used by StackGAN to generate realistic images from text? How does StackGAN work?

Aneesh bhat
3 votes, 3 answers
Is the output of image generation models like Midjourney and Stable Diffusion deterministic?
Assume the user can set all parameters, including but not limited to the seed.
Is the output deterministic? That is, will the same set of inputs always create the same image?

Mindwin Remember Monica
3 votes, 1 answer
Why can't AI image generators output verbatim text when prompted to do so?
I want to create a splash screen that includes the name of my project. DALL-E 2 changed some of the letters in the name, even when I tried putting the name of my project in double-quotes (").
Other prompts to create images with short verbatim text,…

Silver Sagely
3 votes, 2 answers
Image-in image-out neural network architectures
With an RGB image of a paper sheet with text, I want to obtain an output image which is cropped and deskewed. Example of input:
I have tried non-AI tools (such as OpenCV's findContours) to find the 4 corners of the sheet, but it's not very robust in…

logijaz
3 votes, 0 answers
Best Machine Learning Model for "Predicted" Image Generation
I am currently working on undergraduate research to determine hotspots for hand-surface contact. Ideally, I would like to give the model a depth image as input:
Example of synthetic depth image
and return an image mask indicating where the surface…

GB-DEV
3 votes, 1 answer
What is the state-of-the-art algorithm for neural style transfer?
I've read the paper A Neural Algorithm of Artistic Style by Gatys et al. and I find the application of neural style transfer very fun.
I also read that Exploring the structure of a real-time, arbitrary neural artistic stylization network by Ghiasi…

DeepNet
3 votes, 1 answer
How can I generate unique random patterns (similar to the ones in Nutella jars)?
How can I generate unique patterns, as they did for these Nutella jars? See, for example, the video Algorithm designs seven million different jars of Nutella.

Divyansh Gupta