Which deep learning models are suitable for image-to-image mapping?

Question

I am working on a problem in which I need to train a neural network to map one or more input images to one or more output images (one channel per image). Below I report some examples of input and output. Here I show one input and one output image, but I may need to move to several inputs and outputs, perhaps by encoding them as channels. All the images are of this kind, possibly rotated, translated, or slightly changed in shape. (FYI, they are fields defined by fluid dynamics simulations.)
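As an illustration of the "encoding this in channels" idea, here is a minimal sketch using NumPy. The sizes (8 samples, 3 input fields, 2 output fields, 64x64 grid) and the random arrays are hypothetical stand-ins, not the actual simulation data.

```python
import numpy as np

# Hypothetical sizes: 8 samples, 3 input fields, 2 output fields, 64x64 grid.
n_samples, n_in, n_out, height, width = 8, 3, 2, 64, 64

rng = np.random.default_rng(0)
# Random stand-ins for the simulated fields; each entry has shape (N, H, W).
input_fields = [rng.standard_normal((n_samples, height, width)) for _ in range(n_in)]
output_fields = [rng.standard_normal((n_samples, height, width)) for _ in range(n_out)]

# Stack along a new channel axis to get arrays of shape (N, C, H, W).
x = np.stack(input_fields, axis=1).astype(np.float32)
y = np.stack(output_fields, axis=1).astype(np.float32)

print(x.shape, y.shape)  # (8, 3, 64, 64) (8, 2, 64, 64)
```

With this layout, going from one input/output image to several only changes the channel counts of the first and last layers of the network.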

I was thinking about a CNN, but the standard architecture used for image classification (convolutional layers followed by fully connected layers) does not seem to be the best choice. Instead, I tried the U-Net architecture, which is composed of compression (downsampling) and decompression (upsampling) convolutional layers. This works quite well, but there may be another architecture better suited to my problem.
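For readers unfamiliar with the compression+decompression structure, here is a minimal PyTorch sketch of a two-level U-Net. It is an assumption-laden illustration, not the network actually used here: the class name `TinyUNet`, the depth, the `base` width, and the 64x64 input are all hypothetical choices.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions; padding=1 keeps the spatial size unchanged.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Hypothetical two-level U-Net; H and W must be divisible by 4."""

    def __init__(self, in_channels=1, out_channels=1, base=16):
        super().__init__()
        # Compression path: each pooling step halves the resolution.
        self.enc1 = conv_block(in_channels, base)
        self.enc2 = conv_block(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(base * 2, base * 4)
        # Decompression path: transposed convolutions double the resolution.
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = conv_block(base * 2, base)
        # 1x1 convolution with no activation, since the target is a real-valued field.
        self.head = nn.Conv2d(base, out_channels, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        # Skip connections: concatenate encoder features along the channel axis.
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)


model = TinyUNet(in_channels=1, out_channels=1)
x = torch.randn(4, 1, 64, 64)  # random stand-in for a batch of input fields
y = model(x)
print(y.shape)  # torch.Size([4, 1, 64, 64])
```

The output has the same spatial size as the input, so a pixel-wise regression loss such as `nn.MSELoss` can be applied directly. Multiple input or output images would only change `in_channels` and `out_channels`.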

Any suggestion would be appreciated!

Did you figure out what worked well? I am working on a similar problem. It would be great if you could share any ideas. Thank you – parth modi, Nov 30 '22 at 10:45

score 2 · Answer 1 · answered Jan 26 '20 at 03:15

Since you have already tried U-Net, you may want to look into Siamese networks (with CNNs for images); they are well known for computing similarity with deep learning. This is a central idea that can be applied to both text and images. As a tip, you may be able to reuse much of the U-Net architecture in a Siamese network.

Hope it helps. Some useful links to start with:

HARSH NILESH PATHAK


Hi, thank you for the tip. However, I don't see how I could use the Siamese architecture in my problem: I have only one image from which I should output another image; I am not comparing anything. – Giulio Ortali Jan 27 '20 at 07:59
It would be great if you could provide more information about your dataset (train/test). Another suggestion would be to convert your U-Net into a VAE (variational autoencoder), which lets you generate more images from a given image. Let me know if this helps and I can edit the answer above. A VAE can be trained in two ways: (1) unsupervised and (2) supervised. It is usually trained in an unsupervised fashion, but in your case supervised training could be better. This raises the question of how it differs from U-Net; I think the difference is that you can control the variation, but this would have to be checked empirically. – HARSH NILESH PATHAK Jan 28 '20 at 04:31
