
I have been taking a course on deep neural networks. In one of the exercises I built an RNN for sentiment classification, but I did not understand how an RNN is able to deal with sentences of different lengths during sentiment classification.

1 Answer


One of the essential pre-processing steps we apply to the corpus is converting the variable-length sentences to a fixed length. There are various ways to do this:

Truncate


This involves cutting every sentence down to the length of the shortest sentence in the corpus. It is rarely done in practice because it discards information that the model could otherwise learn from. The image shows pre-sequence truncation, where we remove tokens from the back so that all sentences have the same length.

[Image: Truncate Example]
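As a rough illustration of truncation, here is a minimal sketch in plain Python. The function name `truncate_to_shortest`, its `truncating` argument, and the token ids in `corpus` are all made up for this example; sentences are assumed to be already encoded as lists of integer ids.

```python
def truncate_to_shortest(sequences, truncating="post"):
    """Cut every sequence down to the length of the shortest one.

    truncating="post" drops tokens from the end of each sequence,
    truncating="pre" drops tokens from the start.
    """
    if truncating not in ("pre", "post"):
        raise ValueError("truncating must be 'pre' or 'post'")
    min_len = min(len(seq) for seq in sequences)
    if truncating == "post":
        # Keep the first min_len tokens.
        return [list(seq[:min_len]) for seq in sequences]
    # Keep the last min_len tokens.
    return [list(seq[len(seq) - min_len:]) for seq in sequences]


# Hypothetical corpus: three sentences encoded as integer token ids.
corpus = [[4, 9, 2, 7, 5], [3, 8], [6, 1, 5]]

truncated_post = truncate_to_shortest(corpus, truncating="post")
truncated_pre = truncate_to_shortest(corpus, truncating="pre")
print(truncated_post)  # [[4, 9], [3, 8], [6, 1]]
print(truncated_pre)   # [[7, 5], [3, 8], [1, 5]]
```

Note how much of the longest sentence is thrown away, which is why this is rarely used on its own. If you use Keras, `pad_sequences` exposes a `truncating` argument for the same choice of which end to cut.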

Padding


This is generally the preferred method for handling variable-length sentences. In this approach, we extend each sequence to the length of the longest sentence in the corpus. There are two ways to do this:

  • Post-Padding: Adding zeros at the end of the sequence
  • Pre-Padding: Adding zeros at the beginning of the sequence
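The two padding variants can be sketched in plain Python as follows. This is only an illustration: `pad_to_longest`, its arguments, and the token ids are hypothetical, and it assumes the id 0 is reserved for padding and never used for a real word.

```python
def pad_to_longest(sequences, padding="post", value=0):
    """Pad every sequence with `value` up to the length of the longest one.

    padding="post" appends the filler at the end,
    padding="pre" puts it at the beginning.
    """
    if padding not in ("pre", "post"):
        raise ValueError("padding must be 'pre' or 'post'")
    max_len = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        filler = [value] * (max_len - len(seq))
        if padding == "post":
            padded.append(list(seq) + filler)
        else:
            padded.append(filler + list(seq))
    return padded


# Hypothetical corpus: three sentences encoded as integer token ids.
corpus = [[4, 9, 2, 7, 5], [3, 8], [6, 1, 5]]

post_padded = pad_to_longest(corpus, padding="post")
pre_padded = pad_to_longest(corpus, padding="pre")
print(post_padded)  # [[4, 9, 2, 7, 5], [3, 8, 0, 0, 0], [6, 1, 5, 0, 0]]
print(pre_padded)   # [[4, 9, 2, 7, 5], [0, 0, 0, 3, 8], [0, 0, 6, 1, 5]]
```

After this step every sentence has the same length, so the whole batch can be stacked into one array and fed to the RNN. In Keras, `pad_sequences` does the same job through its `padding` and `maxlen` arguments.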

References


Effect of Padding on LSTMs and CNNs by Dwarampudi Mahidhar Reddy and N V Subba Reddy, et al.

Saurav Maheshkar