I am interested in the possibility of having extra input along with the main data. For instance, a medical application that would rely mostly on an image: how could one also account for sex, age, etc.?

It is certainly possible to feed the output of a CNN, together with the other data, into, say, a densely connected network, but that seems inefficient. Are there well-established ways of doing something like this?

nbro

1 Answer

A more efficient way is to build a multi-input model, along these lines:

___________     _____________
|__Image__|     |Other input|
_____|_____     ______|______
|___CNN___|     |___Dense___|
_____|_____     ______|______
|Features1|     |_Features2_|
     |________________|
         _____|_____
         |__Merge__|
         _____|_____
         |__Dense__|
         _____|_____
         |_Output__|
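
As a rough illustration of the diagram, here is a minimal NumPy sketch of the forward pass only. Everything in it is an assumption made for the example: the weights are random and untrained, the "CNN" is a single 3x3 convolution with global average pooling, and the names (`conv_branch`, `dense`, the two tabular features) are hypothetical. In practice you would build the same two-branch layout in a deep learning framework and train both branches jointly.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def conv_branch(images, kernels):
    """Tiny stand-in for a CNN: one valid convolution, ReLU, global average pooling.

    images:  (N, H, W, C)
    kernels: (kh, kw, C, F)
    returns: (N, F)
    """
    kh, kw = kernels.shape[:2]
    # windows has shape (N, H-kh+1, W-kw+1, C, kh, kw)
    windows = sliding_window_view(images, (kh, kw), axis=(1, 2))
    feature_maps = relu(np.einsum("nhwcij,ijcf->nhwf", windows, kernels))
    return feature_maps.mean(axis=(1, 2))


def dense(x, weights, bias):
    return x @ weights + bias


# Hypothetical batch: 4 RGB images of 32x32 plus 2 scaled tabular features each
# (for example, normalised age and a 0/1 encoding of sex).
images = rng.random((4, 32, 32, 3))
tabular = np.array([[0.63, 1.0], [0.41, 0.0], [0.72, 0.0], [0.29, 1.0]])

# Random, untrained parameters, used only to show how the shapes fit together.
kernels = rng.normal(size=(3, 3, 3, 8))
w_tab, b_tab = rng.normal(size=(2, 4)), np.zeros(4)
w_out, b_out = rng.normal(size=(12, 1)), np.zeros(1)

features1 = conv_branch(images, kernels)                  # image branch:  (4, 8)
features2 = relu(dense(tabular, w_tab, b_tab))            # other input:   (4, 4)
merged = np.concatenate([features1, features2], axis=1)   # merge:         (4, 12)
output = dense(merged, w_out, b_out)                      # final dense:   (4, 1)
```

The point of the layout is that the extra inputs only enter after the image has been reduced to a compact feature vector, so they add very few parameters.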

Alternatively, you could combine the additional (non-image) data with the image itself, as described in this Quora answer:

The out-of-the-box method:

If you want to just take your CNN library and use it without much thought, there's an easy way to do it.

Your image has “channels”: red, blue, and green channels, for example. Just add another channel for each unstructured feature. Those channels will just be 2D arrays whose entries are all the same value: the value of your outside feature.

It means more memory and more parameters though. If you have a lot of unstructured data, this can become prohibitively expensive.
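
To make the constant-channel idea concrete, here is a small NumPy sketch (not part of the quoted answer). The function name `add_feature_channels` and the channels-last layout `(N, H, W, C)` are assumptions for the example; adapt the axis if your library uses channels-first.

```python
import numpy as np


def add_feature_channels(images, features):
    """Append one constant-valued channel per extra feature.

    images:   (N, H, W, C)
    features: (N, F)
    returns:  (N, H, W, C + F)
    """
    n, h, w, _ = images.shape
    # Repeat each scalar feature over the full H x W grid.
    planes = np.broadcast_to(features[:, None, None, :], (n, h, w, features.shape[1]))
    return np.concatenate([images, planes], axis=-1)


# Hypothetical data: 2 RGB images of 4x4 and 2 extra features per image.
images = np.zeros((2, 4, 4, 3))
features = np.array([[0.5, 1.0], [0.25, 0.0]])
stacked = add_feature_channels(images, features)  # shape (2, 4, 4, 5)
```

The first convolutional layer then needs `C + F` input channels, which is where the extra memory and parameters come from.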

The more efficient method (and still not hard):

You use one or more deconvolutional filters to bring the unstructured data up to the size of the structured data, concatenate them along the channel dimension, and keep going as if nothing happened.
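
One possible reading of this, sketched in NumPy (again, not part of the quoted answer): treat the feature vector as a 1x1 "image" with F channels and apply a transposed convolution whose kernel is as large as the target image. With a 1x1 input, stride 1 and no padding, that operation reduces to a weighted sum of kernel slices, which is what the `einsum` below computes. The name `upsample_features`, the single output channel, and the random kernel are assumptions; in a real model the kernel would be learned.

```python
import numpy as np


def upsample_features(features, kernel):
    """Transposed convolution of a 1x1 input with an H x W kernel.

    features: (N, F)        treated as N inputs of spatial size 1x1 with F channels
    kernel:   (H, W, F, K)  one H x W filter per (input channel, output channel) pair
    returns:  (N, H, W, K)
    """
    return np.einsum("nf,hwfk->nhwk", features, kernel)


rng = np.random.default_rng(0)

# Hypothetical data: 2 RGB images of 8x8 and 2 extra features per image.
images = rng.random((2, 8, 8, 3))
features = np.array([[0.5, 1.0], [0.25, 0.0]])

kernel = rng.normal(size=(8, 8, 2, 1))          # random stand-in for learned weights
planes = upsample_features(features, kernel)    # shape (2, 8, 8, 1)
combined = np.concatenate([images, planes], axis=-1)  # shape (2, 8, 8, 4)
```

Here all F features are mixed into K output planes (K = 1 in the sketch), so the number of added channels no longer has to grow with the number of features.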

Source: How can l train a CNN with extra features other than the pixels? (Quora)

DukeZhou
Clement