For questions related to the convolution operation in mathematics, convolutional neural networks, image processing and computer vision.
Questions tagged [convolution]
89 questions
22
votes
2 answers
Why would you implement the position-wise feed-forward network of the transformer with convolution layers?
The Transformer model introduced in "Attention Is All You Need" by Vaswani et al. incorporates a so-called position-wise feed-forward network (FFN):
In addition to attention sub-layers, each of the layers in our encoder
and decoder contains a…

Eli Korvigo
- 321
- 2
- 6
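A minimal NumPy sketch of the equivalence this question is about: applying the same two-layer network to every position gives the same result as two convolutions with kernel size 1. The function names, the ReLU activation, and the sizes (`seq_len`, `d_model`, `d_ff`) are illustrative assumptions, not taken from any particular implementation.

```python
import numpy as np

def position_wise_ffn(x, w1, b1, w2, b2):
    """Apply the same two-layer MLP to every position; x has shape (seq_len, d_model)."""
    hidden = np.maximum(0.0, x @ w1 + b1)  # ReLU
    return hidden @ w2 + b2

def conv1d_kernel1(x, kernel, bias):
    """1D convolution with kernel size 1; kernel has shape (1, in_channels, out_channels)."""
    out = np.empty((x.shape[0], kernel.shape[2]))
    for t in range(x.shape[0]):  # slide the size-1 window over positions
        out[t] = x[t] @ kernel[0] + bias
    return out

def ffn_as_convs(x, w1, b1, w2, b2):
    """The same FFN written as two kernel-size-1 convolutions (hypothetical helper)."""
    hidden = np.maximum(0.0, conv1d_kernel1(x, w1[None], b1))
    return conv1d_kernel1(hidden, w2[None], b2)

rng = np.random.default_rng(0)
seq_len, d_model, d_ff = 5, 8, 32  # assumed sizes, for illustration only
x = rng.normal(size=(seq_len, d_model))
w1, b1 = rng.normal(size=(d_model, d_ff)), rng.normal(size=d_ff)
w2, b2 = rng.normal(size=(d_ff, d_model)), rng.normal(size=d_model)

dense_out = position_wise_ffn(x, w1, b1, w2, b2)
conv_out = ffn_as_convs(x, w1, b1, w2, b2)
same = np.allclose(dense_out, conv_out)
```

Under these assumptions the two formulations differ only in how the weights are stored, so choosing one over the other is mostly a matter of framework convenience.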
14
votes
2 answers
What is the difference between graph convolution in the spatial vs spectral domain?
I've been reading different papers regarding graph convolution, and it seems that they come in two flavors: spatial and spectral. From what I can see, the main difference between the two approaches is that for spatial you're directly multiplying the…

razvanc92
- 1,108
- 7
- 18
10
votes
2 answers
When should I use 3D convolutions?
I am new to convolutional neural networks, and I am learning 3D convolution. What I could understand is that 2D convolution gives us relationships between low-level features in the X-Y dimension, while the 3D convolution helps detect low-level…

Shobhit Verma
- 161
- 1
- 7
10
votes
1 answer
How can the convolution operation be implemented as a matrix multiplication?
How can the convolution operation used by CNNs be implemented as a matrix-vector multiplication? We often think of the convolution operation in CNNs as a kernel that slides across the input. However, rather than sliding this kernel (e.g. using…

nbro
- 39,006
- 12
- 98
- 176
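One way to picture the question above is a small NumPy sketch that builds a sparse, Toeplitz-like matrix from the kernel and multiplies it with the flattened image. It assumes stride 1, no padding, and the cross-correlation convention (no kernel flip); `kernel_to_matrix` and `sliding_cross_correlation` are hypothetical helper names.

```python
import numpy as np

def sliding_cross_correlation(image, kernel):
    """Reference implementation: slide the kernel over the image (valid, stride 1)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def kernel_to_matrix(kernel, image_shape):
    """Build a matrix M such that M @ image.ravel() equals the flattened output."""
    h, w = image_shape
    kh, kw = kernel.shape
    oh, ow = h - kh + 1, w - kw + 1
    m = np.zeros((oh * ow, h * w))
    for i in range(oh):
        for j in range(ow):
            for a in range(kh):
                for b in range(kw):
                    # Row = output pixel (i, j); column = input pixel (i + a, j + b).
                    m[i * ow + j, (i + a) * w + (j + b)] = kernel[a, b]
    return m

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])

matrix = kernel_to_matrix(kernel, image.shape)          # shape (9, 16)
via_matmul = (matrix @ image.ravel()).reshape(3, 3)
via_sliding = sliding_cross_correlation(image, kernel)
```

Each row of `matrix` holds the kernel weights at the positions of one receptive field and zeros elsewhere, which is why the matrix is mostly empty for large images.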
8
votes
2 answers
What is the point of using 1D and 2D convolutions with a kernel size of 1 and 1x1 respectively?
I understand the gist of what convolutional neural networks do and what they are used for, but I still wrestle a bit with how they function on a conceptual level. For example, I get that filters with kernel size greater than 1 are used as feature…

Arcturai
- 81
- 1
7
votes
3 answers
Does each filter in each convolution layer create a new image?
Say I have a CNN with this structure:
input = 1 image (say, 30x30 RGB pixels)
first convolution layer = 10 5x5 convolution filters
second convolution layer = 5 3x3 convolution filters
one dense layer with 1 output
So a graph of the network will…

RocketNuts
- 205
- 2
- 6
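The shapes in the network described above can be checked with a little arithmetic. This sketch assumes stride 1, no padding, and one bias per filter, none of which the excerpt states; the helper names are hypothetical.

```python
def conv_output_shape(height, width, kernel_size, num_filters):
    """Output (height, width, channels) of a valid, stride-1 convolution layer."""
    return (height - kernel_size + 1, width - kernel_size + 1, num_filters)

def conv_param_count(in_channels, kernel_size, num_filters):
    """Each filter spans all input channels and has one bias."""
    return num_filters * (kernel_size * kernel_size * in_channels + 1)

# 30x30 RGB input -> ten 5x5 filters -> five 3x3 filters
shape1 = conv_output_shape(30, 30, 5, 10)
shape2 = conv_output_shape(shape1[0], shape1[1], 3, 5)
params1 = conv_param_count(3, 5, 10)
params2 = conv_param_count(10, 3, 5)
```

Because each filter in the second layer spans all 10 feature maps from the first layer, the second layer produces 5 feature maps under these assumptions, not 10 x 5 = 50.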
6
votes
1 answer
Are there any advantages of the local attention against convolutions?
Transformer architectures, based on the self-attention mechanism, have achieved outstanding performance in a variety of applications.
The main advantage of this approach is that a given token can interact with any token in the input sequence and…

spiridon_the_sun_rotator
- 2,454
- 8
- 16
6
votes
1 answer
What is the difference between asymmetric and depthwise separable convolution?
I have recently discovered asymmetric convolution layers in deep learning architectures, a concept which seems very similar to depthwise separable convolutions.
Are they really the same concept with different names? If not, what is the difference?…

Pierre Gramme
- 163
- 4
5
votes
2 answers
Can CNNs be applied to non-image data, given that the convolution and pooling operations are mainly applied to imagery?
When using CNNs for non-image (time series) data prediction, what are some constraints or things to look out for as compared to image data?
To be more precise, I notice there are different types of layers in a CNN model, as described below, which…

nilsinelabore
- 241
- 2
- 12
5
votes
1 answer
Is it possible to vectorise a CNN?
I am trying to write a CNN from scratch and am wondering if it is possible to vectorize the convolution step.
For example, if I had a dataset of 500 RGB images of size 32x32x3, and wanted the first convolutional layer to have 64 filters, how would I…

FeedMeInformation
- 327
- 1
- 7
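A possible way to vectorize the convolution step over a whole batch, sketched with NumPy's `sliding_window_view` and `einsum`. The 3x3 kernel size, stride 1, no padding, and the reduced batch of 8 images (instead of 500, to keep memory small) are assumptions made for illustration; `conv_batch` is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_batch(images, filters):
    """Valid, stride-1 cross-correlation for a batch, without Python loops.

    images:  (N, H, W, C)
    filters: (F, k, k, C)
    returns: (N, H - k + 1, W - k + 1, F)
    """
    k = filters.shape[1]
    # windows has shape (N, H - k + 1, W - k + 1, C, k, k); it is a view, not a copy.
    windows = sliding_window_view(images, (k, k), axis=(1, 2))
    # Sum over the window rows, window columns, and channels for every filter.
    return np.einsum("nhwcij,fijc->nhwf", windows, filters)

rng = np.random.default_rng(0)
images = rng.normal(size=(8, 32, 32, 3))    # small batch of 32x32 RGB images
filters = rng.normal(size=(64, 3, 3, 3))    # 64 filters, assumed 3x3
features = conv_batch(images, filters)      # shape (8, 30, 30, 64)
```

The same function should handle a batch of 500 images unchanged, at the cost of a correspondingly larger output array.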
5
votes
1 answer
What are the benefits of using max-pooling in convolutional neural networks?
I am reading François Chollet's "Deep Learning with Python", and I came across a section about max-pooling that's really giving me trouble.
I am unable to copy-paste the content, so I've included screenshots of the paragraph that's troubling me.
I…

An Ignorant Wanderer
- 191
- 1
- 4
4
votes
2 answers
How is the depth of the input related to the depth of the output of a convolutional layer?
Let's suppose I have an image with 16 channels that goes to a convolutional layer, which has 3 trainable $7 \times 7$ filters, so the output of this layer has depth 3.
How does the convolutional layer go from 16 to 3 channels? What mathematical…

Du Bois Eloi
- 43
- 4
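To make the 16-to-3 channel reduction concrete, here is a small NumPy sketch. Each of the 3 filters has shape 7x7x16, so it covers every input channel, and the products are summed over both space and channels to give a single output channel. The 12x12 spatial size, stride 1, absence of padding and bias, and the name `conv_layer` are assumptions for illustration.

```python
import numpy as np

def conv_layer(x, filters):
    """x: (H, W, C_in); filters: (num_filters, k, k, C_in). Valid, stride 1, no bias."""
    n, k, _, c_in = filters.shape
    assert c_in == x.shape[2], "each filter must span all input channels"
    oh, ow = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((oh, ow, n))
    for f in range(n):
        for i in range(oh):
            for j in range(ow):
                # Multiply the 7x7x16 patch by the 7x7x16 filter and sum everything.
                out[i, j, f] = np.sum(x[i:i + k, j:j + k, :] * filters[f])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(12, 12, 16))        # hypothetical 12x12 input with 16 channels
filters = rng.normal(size=(3, 7, 7, 16)) # three 7x7 filters, each 16 channels deep
y = conv_layer(x, filters)               # shape (6, 6, 3)

# Equivalent view: 16 separate 2D correlations per filter, added together.
per_channel_sum = sum(np.sum(x[:7, :7, c] * filters[1, :, :, c]) for c in range(16))
```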
4
votes
1 answer
Is there anything that ensures that convolutional filters don't end up the same?
I trained a simple model to recognize handwritten numbers from the MNIST dataset. Here it is:
model = Sequential([
    Conv2D(filters=1, kernel_size=(3,1), padding='valid', strides=1, input_shape=(28, 28, 1)),
    Flatten(),
    Dense(10,…

mark mark
- 753
- 4
- 23
4
votes
1 answer
When should we use separable convolution?
I was reading "Deep Learning with Python" by François Chollet. He describes separable convolution as follows:
This is equivalent to separating the learning of spatial features and
the learning of channel-wise features, which makes a lot of…

Enes
- 304
- 3
- 11
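A rough parameter count shows what separating spatial and channel-wise learning can buy. This sketch compares a standard convolution with a depthwise separable one, assuming a depth multiplier of 1 and ignoring biases; the layer sizes (3x3 kernel, 64 input channels, 128 output channels) are made up for illustration.

```python
def standard_conv_params(k, c_in, c_out):
    """Every output channel has its own k x k x c_in filter."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Depthwise step (spatial) followed by a 1x1 pointwise step (channel mixing)."""
    depthwise = k * k * c_in   # one k x k spatial filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise

standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
ratio = standard / separable
```

With these hypothetical sizes the separable layer uses roughly an eighth of the weights, which is the usual argument for trying it when model size or compute is a constraint.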
4
votes
2 answers
Do convolutional neural networks perform convolution or cross-correlation?
Typically, people say that convolutional neural networks (CNNs) perform the convolution operation, hence their name. However, some people have also said that a CNN actually performs the cross-correlation operation rather than the convolution. How is…

nbro
- 39,006
- 12
- 98
- 176
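The two operations in this question differ only in whether the kernel is flipped before sliding. The following NumPy sketch uses the standard 2D definitions with stride 1 and no padding; the function names are hypothetical.

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """Slide the kernel as-is over the image (valid, stride 1)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def convolve2d(image, kernel):
    """Mathematical convolution: flip the kernel on both axes, then cross-correlate."""
    return cross_correlate2d(image, kernel[::-1, ::-1])

image = np.arange(9, dtype=float).reshape(3, 3)
kernel = np.array([[1.0, 2.0], [3.0, 4.0]])

cc = cross_correlate2d(image, kernel)
cv = convolve2d(image, kernel)

# A kernel that is unchanged by the flip gives identical results for both operations.
symmetric = np.array([[1.0, 2.0], [2.0, 1.0]])
same_for_symmetric = np.allclose(
    cross_correlate2d(image, symmetric), convolve2d(image, symmetric)
)
```

Since the kernel weights are learned, a network trained with one operation can represent the other simply by learning the flipped kernel.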