According to the U-Net architecture image from the second page of the research paper (URL link) https://arxiv.org/pdf/1505.04597.pdf
How does the skip connection match its dimension to the same layer in the expansive path?
The output of each up-convolution in the upscaling (expansive) block matches the spatial size of the feature maps from the corresponding convolution layer in the downscaling (contracting) block only after those feature maps are cropped; the cropped maps are then concatenated with the up-convolution output.
This is simply how the network is defined: each conv layer in the downscaling block has a corresponding layer in the upscaling block to which a skip connection is made, except for the layer in the middle (sometimes called the latent layer, or bottleneck). That middle layer separates the downscaling block from the upscaling block, as seen in the original paper.
So in short, it's just the way the network is designed. It doesn't use the whole feature map from the corresponding layer in the downsampling block; only the cropped center region is concatenated.
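The cropping and concatenation can be sketched as follows. This is a minimal NumPy illustration, not the paper's code; the channel counts and spatial sizes (64 channels, 64×64 cropped to 56×56) are chosen for illustration and do not match the exact numbers in the U-Net figure:

```python
import numpy as np

def center_crop(features, target_h, target_w):
    """Center-crop a (C, H, W) feature map to (C, target_h, target_w)."""
    _, h, w = features.shape
    top = (h - target_h) // 2
    left = (w - target_w) // 2
    return features[:, top:top + target_h, left:left + target_w]

# Hypothetical encoder feature map (64 channels, 64x64) and decoder
# up-convolution output (64 channels, 56x56), mimicking the size mismatch
# between the contracting and expansive paths.
encoder = np.random.rand(64, 64, 64)
decoder = np.random.rand(64, 56, 56)

# Crop the encoder map to the decoder's spatial size, then concatenate
# along the channel axis -- this is the skip connection.
skip = center_crop(encoder, 56, 56)
merged = np.concatenate([skip, decoder], axis=0)
print(merged.shape)  # (128, 56, 56)
```

Note that only the center 56×56 region of the 64×64 encoder map survives; the border pixels are discarded, which is why the valid (unpadded) convolutions in the original U-Net shrink the output relative to the input.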
For reference, you can look at TernausNet, where they had to crop randomly to support the VGG encoder within the U-Net structure.