
I have a fully connected network that takes in a variable-length input padded with 0.

However, the network doesn't seem to be learning and I am guessing that the high number of zeros in the input might have something to do with that.

Are there solutions for dealing with padded input in fully connected layers or should I consider a different architecture?

UPDATE (to provide more details):

The goal of the network is to clean up full file paths, e.g.:

  • /My document/some folder/a file name.txt > a file name
  • /Hard drive/book/deeplearning/1.txt > deeplearning

The constraint is that the training labels were generated by running a regex on the file name itself, so they are not very accurate.

I am hoping that, by treating every word equally (without sequential information), the network will be able to generalize as to which kinds of words are usually kept and which are usually discarded.

The network takes in a sequence of word embeddings trained on path data and outputs logits corresponding to the probability of each word being kept or discarded.
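A simplified sketch of this setup (assuming PyTorch; the framework and all names here are illustrative, not taken from my actual code). The masked loss at the end is one common way to stop the zero padding from contributing to the gradients:

    import torch.nn as nn

    class WordKeepClassifier(nn.Module):
        """One keep/discard logit per word, with no sequential information."""
        def __init__(self, embedding_dim=100, hidden_dim=64):
            super().__init__()
            # The same small MLP is applied to every word embedding independently.
            self.mlp = nn.Sequential(
                nn.Linear(embedding_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, 1),
            )

        def forward(self, embeddings):
            # embeddings: (batch, max_len, embedding_dim), zero-padded
            return self.mlp(embeddings).squeeze(-1)  # (batch, max_len) logits

    def masked_bce_loss(logits, labels, mask):
        # labels: float 0/1 per word; mask: 1.0 for real words, 0.0 for padding
        per_word = nn.functional.binary_cross_entropy_with_logits(
            logits, labels, reduction="none")
        return (per_word * mask).sum() / mask.sum()

At inference time, sigmoid(logit) > 0.5 on the unmasked positions gives the keep/discard decision.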

  • There are a few choices of re-representation or different architectures, but the best choice will depend on details of the data. Please describe more about your input data, to help readers understand what the options might be. For instance, is the data from a sequence (text or signal processing)? Does it contain a variable number of equivalent items? A few simplified examples of the data may help. – Neil Slater Mar 07 '18 at 08:20
  • @NeilSlater I've updated the question with additional info – silkAdmin Mar 07 '18 at 08:44

1 Answer


If you use a fully connected neural network, it will be hard to do what you want. Such a network has no inductive bias that lets it generalize across the different positions in the string at which the words you need to extract can appear.

I suggest you use a Seq2Seq model here: http://cs224d.stanford.edu/papers/seq2seq.pdf
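Since the desired output is a keep/discard decision for each word rather than freely generated text, a recurrent tagger from the same family may be a simpler fit than a full encoder-decoder. A minimal sketch, assuming PyTorch and the pre-trained word embeddings from the question as input (all names are illustrative):

    import torch.nn as nn

    class BiLSTMTagger(nn.Module):
        """Emits one keep/discard logit per word in the path."""
        def __init__(self, embedding_dim=100, hidden_dim=64):
            super().__init__()
            self.lstm = nn.LSTM(embedding_dim, hidden_dim,
                                batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden_dim, 1)

        def forward(self, embeddings, lengths):
            # embeddings: (batch, max_len, embedding_dim), zero-padded
            # lengths: true sequence lengths (CPU tensor or list)
            # Packing makes the LSTM skip the padded positions entirely,
            # which also sidesteps the zeros-in-the-input problem.
            packed = nn.utils.rnn.pack_padded_sequence(
                embeddings, lengths, batch_first=True, enforce_sorted=False)
            output, _ = self.lstm(packed)
            output, _ = nn.utils.rnn.pad_packed_sequence(output, batch_first=True)
            return self.out(output).squeeze(-1)  # (batch, max_len) logits

The bidirectional pass gives each word access to its neighbours in the path, which is exactly the positional context a plain fully connected layer cannot exploit.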