
I'm trying to understand how the size of the hidden state affects the GRU.

For example, suppose I want to make a GRU count. I'm going to feed it three numbers, and I expect it to predict the fourth.

How should I choose the size of the hidden state of a GRU?

nbro
razvanc92

1 Answer


Yes, your understanding of the hidden state is correct. However, the size of the hidden state is a hyperparameter that needs to be found by trial and error. There is no closed-form formula that links the size of the hidden state to the problem at hand. There are some rules of thumb, such as starting with a hidden-state size that is a power of 2. Keep tuning the hyperparameter until the predictions are good enough for your task.
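
To get a feel for what each candidate size costs, here is a minimal sketch that counts the trainable parameters of a single GRU layer for a few power-of-2 hidden sizes. It assumes the textbook GRU with three gates (reset, update, candidate), each with one input weight matrix, one recurrent weight matrix, and one bias vector. The function name `gru_param_count` and the `biases_per_gate` argument are made up for this illustration; some libraries (PyTorch's `nn.GRU`, for example) keep two bias vectors per gate, which you can approximate with `biases_per_gate=2`.

```python
def gru_param_count(input_size: int, hidden_size: int, biases_per_gate: int = 1) -> int:
    """Trainable parameters in one GRU layer (hypothetical helper, not a library API)."""
    per_gate = (
        hidden_size * input_size          # input-to-hidden weights
        + hidden_size * hidden_size       # hidden-to-hidden (recurrent) weights
        + biases_per_gate * hidden_size   # bias vector(s)
    )
    # Three gates: reset, update, and candidate state.
    return 3 * per_gate


# Assumption: one number per time step, as in the counting example.
for h in (2, 4, 8, 16, 32, 64):
    print(f"hidden_size={h:3d} -> {gru_param_count(1, h)} parameters")
```

The recurrent term grows with the square of the hidden size, so doubling the hidden size roughly quadruples the number of weights. For a task as small as counting, it is probably reasonable to start the search at the small end and only go up if the model underfits.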

nbro
varsh