
I am a beginner in deep learning (DL). I have done some tutorials and I know the basics of TensorFlow, but I have trouble understanding how to construct more advanced NNs.

Let's say I have 6 inputs and a list of 500 names from which you can pick any, but only 6 at a time. The output should be a single value between $0.0$ and $1.0$.

My question is: how can I handle arbitrary order in the inputs? Suppose inputs 1-6 hold the names ABCDEF and the output score is 0.7. I need the same output if the inputs arrive in the order CEDBFA. How can I handle this? Should I randomly shuffle the inputs during training, or should I represent each set of names as a 500-D binary vector like $[0,0,1,0,1,...,0,1,0,0,0]$, where each index position corresponds to the token of one name, and then feed it into 500 inputs? Or is there a better way?

nbro
tech2097

1 Answer


Note: Any advice here is only an estimate until you actually run the experiment. ML is not always predictable.

If order truly does not matter, then I think it will be better to design a network architecture that automatically ignores order, instead of using one that cares about order and then training it to ignore order. If nothing else, less training data will be needed since you don't need to train it on permutations of the input - similar to why CNNs are useful for image recognition.

One network architecture could be to have an "input processing block" (a group of layers) and an "output processing block". First apply the same input processing block to each of the 6 inputs. Then add together all the outputs of the input processing blocks. Because addition is insensitive to order, this step completely discards any information about the order. Finally, apply the output processing block to that sum; the output of that block is your final output.

           output
             ^
             |
           +--+
           |NN| (different NN to the one below)
           +--+
             ^
             |
+---------------------------+
| add all together          |
+---------------------------+
 ^    ^    ^    ^    ^    ^
 |    |    |    |    |    |
+--+ +--+ +--+ +--+ +--+ +--+
|NN| |NN| |NN| |NN| |NN| |NN| (same NN 6 times)
+--+ +--+ +--+ +--+ +--+ +--+
 ^    ^    ^    ^    ^    ^
 |    |    |    |    |    |
 C    E    D    B    F    A
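
To make the diagram concrete, here is a minimal NumPy sketch of the forward pass. It is only an illustration under my own assumptions: the layer sizes (`EMBED_DIM`, `HIDDEN_DIM`), the embedding lookup, the ReLU and the final sigmoid are hypothetical choices, and the weights are random and untrained, so the score itself is meaningless. The point is only that both orderings give the same number.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_NAMES = 500   # size of the name list, from the question
EMBED_DIM = 16    # hypothetical size, chosen for illustration
HIDDEN_DIM = 32   # hypothetical size, chosen for illustration

# Input processing block: ONE set of weights shared by all six inputs.
embedding = rng.normal(size=(NUM_NAMES, EMBED_DIM))
w_in = rng.normal(size=(EMBED_DIM, HIDDEN_DIM)) * 0.1
b_in = np.zeros(HIDDEN_DIM)

# Output processing block: a different set of weights.
w_out = rng.normal(size=(HIDDEN_DIM, 1)) * 0.1
b_out = np.zeros(1)


def input_block(name_ids):
    """Apply the same small network to every name index independently."""
    x = embedding[name_ids]                    # shape (6, EMBED_DIM)
    return np.maximum(x @ w_in + b_in, 0.0)    # shape (6, HIDDEN_DIM), ReLU


def output_block(pooled):
    """Map the pooled vector to one value in (0, 1) with a sigmoid."""
    logit = pooled @ w_out + b_out
    return 1.0 / (1.0 + np.exp(-logit))


def score(name_ids):
    per_item = input_block(np.asarray(name_ids))
    pooled = per_item.sum(axis=0)   # "add all together": order is discarded here
    return float(output_block(pooled)[0])


# Hypothetical token ids: A=0, B=1, C=2, D=3, E=4, F=5.
abcdef = [0, 1, 2, 3, 4, 5]
cedbfa = [2, 4, 3, 1, 5, 0]
print(score(abcdef), score(cedbfa))  # equal up to floating-point rounding
```

As a side note, if the input block were just a one-hot encoding of the name, the sum would be exactly the 500-D binary vector proposed in the question, so that encoding can be seen as a special case of this design.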

This is just one idea. You are not limited to addition; you can use any commutative operation in the middle. Even something like an attention layer could be used.
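
As a quick check of that claim, the sketch below compares a few pooling operations on a stand-in feature matrix (random numbers playing the role of the six input-block outputs; the shapes and the fixed permutation are my own illustrative choices). Sum, mean and element-wise max give the same result after shuffling the rows, while plain concatenation does not.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the outputs of the six input processing blocks (6 items, 4 features).
features = rng.normal(size=(6, 4))

# A fixed reordering of the six items (ABCDEF -> CEDBFA).
perm = [2, 4, 3, 1, 5, 0]
shuffled = features[perm]

# Commutative poolings: the result does not depend on row order.
poolings = {"sum": np.sum, "mean": np.mean, "max": np.max}
order_invariant = {
    name: bool(np.allclose(fn(features, axis=0), fn(shuffled, axis=0)))
    for name, fn in poolings.items()
}
print(order_invariant)

# Concatenation keeps the order, so it is NOT a valid choice for the middle step.
concat_same = bool(np.allclose(features.reshape(-1), shuffled.reshape(-1)))
print("concatenation invariant:", concat_same)
```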

user253751