Questions tagged [dense-layers]

21 questions
5
votes
1 answer

How to add a dense layer after a 2d convolutional layer in a convolutional autoencoder?

I am trying to implement a convolutional autoencoder with a dense layer at the bottleneck to do some dimensionality reduction. I have seen two approaches for this, which aren't particularly scalable. The first was to introduce 2 dense layers (one at…
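A minimal Keras sketch of one common pattern for this (the 28x28x1 input and layer sizes are illustrative assumptions, not taken from the question): flatten the last conv output, apply a small Dense bottleneck, then Dense + Reshape back to the conv shape for the decoder.

```python
# Sketch only: Conv2D encoder -> Flatten -> Dense bottleneck -> Dense -> Reshape
# -> Conv2DTranspose decoder. Shapes assume 28x28x1 inputs.
import tensorflow as tf
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inp)   # 14x14x16
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)     # 7x7x32
pre_flatten_shape = tuple(x.shape[1:])                                        # (7, 7, 32)
x = layers.Flatten()(x)
code = layers.Dense(32, activation="relu")(x)                                 # low-dimensional bottleneck

x = layers.Dense(7 * 7 * 32, activation="relu")(code)
x = layers.Reshape(pre_flatten_shape)(x)
x = layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)
out = layers.Conv2DTranspose(1, 3, strides=2, padding="same", activation="sigmoid")(x)

autoencoder = Model(inp, out)
```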
2
votes
1 answer

What gets optimized in a convolutional neural network?

In a convolutional neural network, hyperparameters such as the number of kernels, stride, kernel size, etc. are determined in advance. After some combination of convolution, ReLU and pooling layers, there is the fully connected (FC) layer at the end, which…
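For reference, a small Keras sketch (illustrative shapes) that makes the distinction concrete: the kernel weights/biases and the FC weights/biases are the trainable variables, while kernel size, stride and the number of kernels are fixed hyperparameters.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, 3, activation="relu"),   # 8 * (3*3*3) kernel weights + 8 biases
    layers.MaxPooling2D(),                    # no parameters at all
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # FC weights + biases
])
for v in model.trainable_variables:
    print(v.name, v.shape)                    # these tensors are what gradient descent updates
```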
2
votes
1 answer

What's the purpose of layers without biases?

I noticed that the TensorFlow library includes a use_bias parameter for the Dense layer, which is set to True by default, but allows you to disable it. At first glance, it seems unfavorable to turn off the biases, as this may negatively affect data…
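A tiny sketch of the parameter in question, plus one common case where disabling the bias is deliberate (a Dense layer followed by BatchNormalization, whose learnable shift makes a separate bias redundant); the sizes are illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

dense_with_bias = layers.Dense(64)                 # kernel + bias (default use_bias=True)
dense_no_bias = layers.Dense(64, use_bias=False)   # kernel only

block = tf.keras.Sequential([
    layers.Input(shape=(128,)),
    layers.Dense(64, use_bias=False),
    layers.BatchNormalization(),                   # its learnable shift (beta) plays the bias's role
    layers.ReLU(),
])
```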
2
votes
3 answers

Why does a neuron in a multi-layer network need several input connections?

For example, if I have the following architecture: Each neuron in the hidden layer has a connection from each one in the input layer. 3 x 1 input matrix and a 4 x 3 weight matrix (for the backpropagation we have of course the transposed version 3…
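A NumPy sketch of the forward pass described in the question (3 x 1 input, 4 x 3 weight matrix): every row of the weight matrix connects one hidden neuron to all three inputs.

```python
import numpy as np

x = np.array([[0.5], [0.1], [0.9]])   # 3 x 1 input
W = np.random.randn(4, 3)             # 4 hidden neurons, each with 3 input connections
h = W @ x                             # 4 x 1: each row is one neuron's weighted sum of all inputs

# Backpropagation uses the transpose: dL/dx = W.T @ dL/dh  (3 x 4 times 4 x 1 -> 3 x 1)
dL_dh = np.ones((4, 1))
dL_dx = W.T @ dL_dh
```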
2
votes
1 answer

How do you go from the last convolutional layer to your first fully connected layer?

I'm implementing a neural network framework from scratch in C++ as a learning exercise. There is one concept I don't see explained anywhere clearly: How do you go from your last convolutional or pooling layer, which is 3 dimensional, to your first…
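A NumPy sketch of the usual answer (shapes are illustrative assumptions): the 3-D conv/pool output is flattened into one long vector in a fixed order, and the first fully connected layer is just an ordinary weight matrix over that vector.

```python
import numpy as np

conv_out = np.random.randn(32, 7, 7)      # (channels, height, width) from the last pool layer
flat = conv_out.reshape(-1)               # 32 * 7 * 7 = 1568 values in a fixed ordering
W_fc = np.random.randn(128, flat.size)    # first FC layer: 128 units, 1568 inputs each
b_fc = np.zeros(128)
fc_out = W_fc @ flat + b_fc               # a standard dense layer from here on
```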
1
vote
1 answer

Number of units in Final softmax layer in VGGNet16

I am trying to implement and train the VGGNet neural network model from scratch, on my own data. I am reproducing all the layers of the model. I am confused about the last, fully connected softmax layer. In the research paper by Simonyan and…
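For context, in the Simonyan and Zisserman setup the final FC layer has 1000 units (one per ImageNet class) with a softmax; on your own data it should have as many units as you have classes. A Keras sketch, with num_classes as a placeholder:

```python
from tensorflow.keras import layers, models

num_classes = 10                                      # assumed; set to your dataset's class count
head = models.Sequential([
    layers.Input(shape=(7 * 7 * 512,)),               # flattened output of VGG16's last pooling layer
    layers.Dense(4096, activation="relu"),
    layers.Dense(4096, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),  # 1000 units in the original ImageNet setting
])
```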
1
vote
1 answer

Why in Multi-Head Attention implementation should we use $3$ linear layers for Q, K, V instead of $3 * h$ layers?

I have been trying to implement a Transformer architecture using PyTorch by following the Attention Is All You Need paper as well as The Annotated Transformer blog post to compare my code with theirs. And I noticed that in their implementation…
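A PyTorch sketch of the pattern used in The Annotated Transformer: one d_model x d_model linear layer per Q, K, V, whose output is reshaped into h heads; this is equivalent to h separate d_model x d_k projections stacked side by side.

```python
import torch
import torch.nn as nn

d_model, h = 512, 8
d_k = d_model // h
W_q = nn.Linear(d_model, d_model)                 # one layer, not h separate ones

x = torch.randn(2, 10, d_model)                   # (batch, seq_len, d_model)
q = W_q(x).view(2, 10, h, d_k).transpose(1, 2)    # (batch, h, seq_len, d_k): h heads from one matrix
```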
1
vote
0 answers

About RNN followed by dense layer

In an RNN we get one output for each time step of the input, right? That is, if we give "I am Good" as input, we get three outputs representing I, then am, then Good. So if we connect a dense layer after the RNN layer, does it just connect it with the…
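A Keras sketch of the two cases (illustrative shapes): with return_sequences=True the Dense layer is applied to every time step independently, giving one output per token; without it the Dense layer only sees the final hidden state.

```python
from tensorflow.keras import layers, models

per_step = models.Sequential([
    layers.Input(shape=(3, 8)),                    # 3 time steps, 8 features each
    layers.SimpleRNN(16, return_sequences=True),   # one hidden state per time step
    layers.Dense(5),                               # applied to each step: output (batch, 3, 5)
])

last_only = models.Sequential([
    layers.Input(shape=(3, 8)),
    layers.SimpleRNN(16),                          # only the final hidden state
    layers.Dense(5),                               # output (batch, 5)
])
```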
1
vote
1 answer

Why do transformer Key Query Value layers not have biases or activations?

Transformers use just matrices to transform input embeddings, which is halfway to being a connected dense layer (add a bias and activation). So, why don't transformers have dense layers for encoding input into Query Key Value?
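A PyTorch sketch of what many implementations do: the Q/K/V projections are plain linear maps, often with bias=False and no activation, since the non-linearity comes from the softmax over the attention scores.

```python
import torch
import torch.nn as nn

d_model = 512
to_q = nn.Linear(d_model, d_model, bias=False)     # no bias, no activation
to_k = nn.Linear(d_model, d_model, bias=False)
to_v = nn.Linear(d_model, d_model, bias=False)

x = torch.randn(2, 10, d_model)
q, k, v = to_q(x), to_k(x), to_v(x)
scores = torch.softmax(q @ k.transpose(-2, -1) / d_model ** 0.5, dim=-1)  # the non-linearity lives here
```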
1
vote
1 answer

Does a second-order fully-connected layer have any uses?

I was thinking about implementing second-order regression via a fully-connected layer, and I came up with this: $X$ is the input data, shaped $(features, batch\_number)$. $w0$ is the bias, shaped $(output\_dims,)$. $w1$ and $w2$ are the weights,…
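A NumPy sketch using the question's shape conventions (with w0 reshaped to a column so it broadcasts over the batch): the layer computes a linear term plus an element-wise quadratic term.

```python
import numpy as np

features, batch, out_dims = 4, 8, 3
X = np.random.randn(features, batch)      # (features, batch_number)
w0 = np.zeros((out_dims, 1))              # bias, as a column so it broadcasts over the batch
w1 = np.random.randn(out_dims, features)  # first-order weights
w2 = np.random.randn(out_dims, features)  # second-order weights

y = w0 + w1 @ X + w2 @ (X ** 2)           # (out_dims, batch)
```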
1
vote
0 answers

Are there any benefits of adding attention to linear layers?

Is attention useful only in transformer/convolution layers? Can I add it to linear layers? If yes, how (on a conceptual level, not necessarily the code to implement the layers)?
1
vote
1 answer

Why does the output shape of a Dense layer contain a batch size?

I understand that the batch size is the number of examples you pass into the neural network (NN). If the batch size is 10, it means you feed the NN 10 examples at once. Assume I have an NN with a single Dense layer. This Dense layer of 20 units…
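A short Keras sketch of what the reported shape means (the 10-feature input is an illustrative assumption): the batch dimension is shown as None because the layer accepts any batch size; only the per-example output size (20) is fixed.

```python
from tensorflow.keras import layers, models

model = models.Sequential([layers.Input(shape=(10,)), layers.Dense(20)])
print(model.output_shape)                 # (None, 20): None stands for "any batch size"
```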
1
vote
1 answer

Can fully connected layers be used for feature detection?

I need help in understanding something basic. In this video, Andrew Ng says, essentially, that convolutional layers are better than fully connected (FC) layers because they use fewer parameters. But I'm having trouble seeing when FC layers…
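A sketch of the parameter-count comparison behind that claim (illustrative 32x32x3 input): a single 3x3 convolution uses a few hundred weights, while a fully connected layer producing a similarly sized output needs over twenty million.

```python
from tensorflow.keras import layers, models

conv = models.Sequential([layers.Input(shape=(32, 32, 3)), layers.Conv2D(8, 3)])
fc = models.Sequential([layers.Input(shape=(32, 32, 3)), layers.Flatten(), layers.Dense(30 * 30 * 8)])

print(conv.count_params())                # 8 * (3*3*3) + 8 = 224
print(fc.count_params())                  # 3072 * 7200 + 7200 = 22,125,600
```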
1
vote
0 answers

When to use convolutional layers as opposed to fully connected layers?

I am still new to CNNs, but I would like to check my understanding between when to use convolutional layers versus fully connected layers. From what I have read, we can use convolutional layers with filters, rather than fully connected layers, with…
0
votes
0 answers

How can validation accuracy be more than test accuracy?

I have been trying to implement DenseNet on a small dataset using k-fold cross-validation. Training accuracy is 94%, validation accuracy is 73%, whereas test accuracy is 90%. I have taken 10% of my total dataset as the test set. I know some overfitting is…