Questions tagged [vgg]

For questions related to the VGG neural networks, which were proposed in "Very Deep Convolutional Networks for Large-Scale Image Recognition" (2015) by Karen Simonyan and Andrew Zisserman.

12 questions
5
votes
2 answers

How do I improve accuracy and know when to stop training?

I am training a modified VGG-16 to classify crowd density (empty, low, moderate, high). Two dropout layers were added at the end of the network, one after each of the last two FC layers. Network settings: the training data contain 4381 images…
4
votes
1 answer

Trying to understand the VGG convolutional neural network architecture

I am trying to understand the VGG architecture and have the following questions. As I understand it, the filter size is increased because we are using max pooling, which reduces the image size. So in order to keep information…
2
votes
2 answers

Why does the number of feature maps increase in the VGG model?

I found the image below of how a CNN works, but I don't really understand it. I think I do understand CNNs, but I find this diagram very confusing. My simplified understanding: Features are selected. Convolution is carried out so as to see…
1
vote
1 answer

Number of units in Final softmax layer in VGGNet16

I am trying to implement and train the VGGNet neural network model from scratch on my own data. I am reproducing all the layers of the model. I am confused about the last, fully connected softmax layer. In the research paper by Simonyan and…
1
vote
0 answers

Inference time of VGG16 when initialised with different weights

I’m trying to understand the differences in inference time and training time between two models: VGG16 with weights initialised from a Glorot uniform distribution and the same network with the only difference being that weights are initialised to…
1
vote
1 answer

How does a VGG-based Style-Loss incorporate color information?

I've recently been reading a lot about style transfer, its applications and implications. I understand what the Gram matrix is and does. I can program it. But one thing that has been puzzling me is: how does the VGG style loss incorporate color…
1
vote
2 answers

Does replacing 3x3 filters with 3x1 and 1x3 filters improve the performance?

Recently I have come up with a VGG16 model for my binary classification task. I have relatively simple signal images; therefore (maybe?) other, deeper models such as ResNet18 and InceptionV3 were not as good. As is known, VGG uses 3x3 filters for…
1
vote
3 answers

Is a VGG-based CNN model sometimes better for image classification than a modern architecture?

I have an image classification task to solve, but under quite simple/favourable conditions: there are only two classes (either good or not good); the images always show the same kind of piece (either with or without a fault); that piece is always filmed from the…
0
votes
3 answers

How are the dimensions of the feature maps produced by the convolutional layer determined in VGG-16?

I'm trying to understand how the dimensions of the feature maps produced by the convolution are determined in a ConvNet. Let's take, for instance, the VGG-16 architecture. How do I get from 224x224x3 to 112x112x64? (The 112 is understandable, it's…
0
votes
1 answer

Why is the partial derivative $0$ when $F_{ij}^l < 0$? Math behind style transfer

I am currently reading about style transfer and trying to understand it. I came across an equation in the research paper. For context, here is the paragraph: Generally each layer in the network defines a…
0
votes
0 answers

Strategy to input and get large images in VGG neural networks

I'm using a style-transfer-based deep learning approach that uses VGG (a neural network). It works well with small images (512x512 pixels); however, it produces distorted results when the input images are large (size > 1500px). The author of…
-1
votes
1 answer

Denoising autoencoder not training properly

I'm trying to build a denoising autoencoder in which the encoder is VGG16 and the decoder mirrors the VGG16 encoder. My dataset consists of 5K grayscale images. While training, the loss and accuracy don't change. I can think of…