
I'm working on code that generates new images using a DCGAN model. The structure of my code is from the PyTorch tutorial here. I'm a bit confused trying to understand how the latent vector is transformed into the feature maps in the Generator part (this line of code is what I'm interested in):

nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False)

This means the latent vector (nz) of length 100 is transformed into 512 matrices of size 4x4 (ngf = 64). How does that happen? I also can't figure out how the length of the latent vector influences the generated image. P.S. The rest of the Generator structure is clear to me.
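
Before guessing at the mechanics, I checked the shape bookkeeping directly. This is a minimal sketch using the tutorial's values (nz = 100, ngf = 64):

    import torch
    import torch.nn as nn

    nz = 100   # length of the latent vector
    ngf = 64   # base number of generator feature maps

    # First generator layer: 1x1 spatial input, 4x4 kernel,
    # stride 1, no padding -> 4x4 spatial output.
    layer = nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False)

    z = torch.randn(1, nz, 1, 1)  # one latent vector, shape (batch, nz, 1, 1)
    print(layer(z).shape)         # torch.Size([1, 512, 4, 4])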

The only idea I have is:

  1. E.g. there is a latent vector of size 100 as input (100 random values).
  2. We convolve each value of the input latent vector with its own 4x4 kernel.
  3. This gives 100 different 4x4 matrices (one matrix per value of the latent vector).
  4. Then we sum all 100 of these matrices and get one final matrix, i.e. one feature map.
  5. We get the necessary number of feature maps by using different sets of kernels.

Is this right, or does it happen in some other way?
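
One way I thought of checking this is to redo the sum by hand and compare it with PyTorch's output (this assumes the weight layout of nn.ConvTranspose2d is (in_channels, out_channels, kH, kW), which is what the docs state):

    import torch
    import torch.nn as nn

    nz, ngf = 100, 64
    layer = nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False)

    z = torch.randn(1, nz, 1, 1)

    # Steps 2-4 done by hand: scale each 4x4 kernel by its latent value,
    # then sum over the 100 input channels.
    # layer.weight has shape (nz, ngf * 8, 4, 4).
    manual = torch.einsum('bi,iohw->bohw', z.view(1, nz), layer.weight)

    print(torch.allclose(manual, layer(z), atol=1e-5))  # True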

CapJS
1 Answer


The noise vector of shape (batch_size, 100, 1, 1) is deconvolved with filter_1 of shape (100, 4, 4). The result is feature_map_1 of shape (1, 4, 4). Since there are 512 such filters, there will be 512 feature maps, and the output shape will be (batch_size, 512, 4, 4).
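
Here is a small sketch that checks this for the first filter. I'm pulling filter_1 out of the layer's weight tensor, which PyTorch stores with shape (in_channels, out_channels, kH, kW):

    import torch
    import torch.nn as nn

    layer = nn.ConvTranspose2d(100, 512, 4, 1, 0, bias=False)
    z = torch.randn(1, 100, 1, 1)

    # filter_1 from above: the (100, 4, 4) slice of the weight tensor
    # that produces the first output feature map.
    filter_1 = layer.weight[:, 0]                 # shape (100, 4, 4)
    feature_map_1 = (z[0] * filter_1).sum(dim=0)  # shape (4, 4)

    print(torch.allclose(feature_map_1, layer(z)[0, 0], atol=1e-5))  # True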

I think you need a better understanding of how convolutional calculations work in general. They are explained very well in this Stack Exchange thread.

Enes