
I have implemented a GRU to deal with YouTube comment data. I am a bit confused about the validation score: it seems to even out around 70% and then keeps rising. This doesn't look like the overfitting I'm used to, since the validation score keeps rising rather than dropping. Is this normal? Does anyone know what's happening here?

I've implemented the GRU as a simple GRU, then a GRU with dropout, and then a multilayer (2-layer) GRU with dropout. A rough sketch of these setups is below, followed by the accuracy graphs.
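For reference, the three setups might look roughly like this in Keras. The framework, layer sizes, dropout rates, and the binary classification head are all assumptions here; the question doesn't specify any of them.

```python
# A rough sketch of the three setups, assuming Keras, an embedding layer,
# and a binary classification head -- none of which is stated in the question.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Embedding, GRU

vocab_size, embed_dim, hidden = 20000, 128, 64   # placeholder sizes

# 1) simple GRU
simple = Sequential([
    Embedding(vocab_size, embed_dim),
    GRU(hidden),
    Dense(1, activation="sigmoid"),
])

# 2) GRU with dropout on the inputs and on the recurrent state
#    (the 0.3 rate is a placeholder)
with_dropout = Sequential([
    Embedding(vocab_size, embed_dim),
    GRU(hidden, dropout=0.3, recurrent_dropout=0.3),
    Dense(1, activation="sigmoid"),
])

# 3) two-layer GRU with dropout; the first layer must return the full
#    sequence so the second GRU receives one vector per timestep
two_layer = Sequential([
    Embedding(vocab_size, embed_dim),
    GRU(hidden, dropout=0.3, recurrent_dropout=0.3, return_sequences=True),
    GRU(hidden, dropout=0.3, recurrent_dropout=0.3),
    Dense(1, activation="sigmoid"),
])
```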

[Plot: training vs. validation accuracy, simple GRU]

[Plot: training vs. validation accuracy, GRU with dropout]

[Plot: training vs. validation accuracy, 2-layer GRU with dropout]

As you can see, in the multilayer GRU with dropout the validation accuracy nicely follows the training accuracy. Does this mean that the other models are simply not capable of capturing certain information? Is there a way to improve these results, and which parameters should be optimized?

nibs

1 Answer


Since all networks' training accuracy gets close to 100%, I would argue that all of the models are capable of learning this task. But the first two models are somewhat overfitting, since their validation accuracy doesn't get nearly as high. Granted, the 2nd model uses dropout, but it seems the dropout rate is not high enough to bring the two accuracy curves closer together; one way to probe this is sketched below.
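A simple sweep over dropout rates makes this visible: train the same single-layer GRU at a few rates and compare the final train/validation gap. Everything below, including the random stand-in data, is a placeholder for the actual setup, and the rates themselves are arbitrary.

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Embedding, GRU

vocab_size, embed_dim, hidden, seq_len = 20000, 128, 64, 100

# Random stand-in data so the sketch runs; substitute the comment dataset.
x_train = np.random.randint(0, vocab_size, size=(1000, seq_len))
y_train = np.random.randint(0, 2, size=(1000,))
x_val = np.random.randint(0, vocab_size, size=(200, seq_len))
y_val = np.random.randint(0, 2, size=(200,))

for rate in (0.2, 0.3, 0.5):                      # arbitrary candidate rates
    model = Sequential([
        Embedding(vocab_size, embed_dim),
        GRU(hidden, dropout=rate, recurrent_dropout=rate),
        Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    hist = model.fit(x_train, y_train, validation_data=(x_val, y_val),
                     epochs=5, verbose=0)
    # A large gap between the two final accuracies suggests the rate is
    # still too low to control the overfitting.
    gap = hist.history["accuracy"][-1] - hist.history["val_accuracy"][-1]
    print(f"dropout={rate}: train/val accuracy gap = {gap:.3f}")
```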

Does the 3rd model have more trainable parameters than the first two? Otherwise I don't see why it would reach so much higher validation accuracy; comparing parameter counts is easy, see the snippet below.
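Assuming the Keras sketch from the question, each model can report its own parameter count. Note that dropout itself adds no parameters, so only the extra GRU layer changes the number:

```python
# Compare parameter counts of the three models from the sketch above.
# A Keras model must be built before counting, e.g. via an input shape.
for name, model in [("simple", simple),
                    ("dropout", with_dropout),
                    ("2-layer", two_layer)]:
    model.build(input_shape=(None, 100))   # 100 = placeholder sequence length
    print(f"{name}: {model.count_params():,} parameters")
```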

It is also worth repeating these experiments at least five times, to check whether you always see this pattern or whether it was just a random occurrence.
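A minimal sketch of such a repetition loop, assuming Keras, a hypothetical make_model() factory that builds and compiles one of the variants, and x_train etc. standing in for the actual data:

```python
import numpy as np
import tensorflow as tf

# Repeat one configuration with several seeds and report the spread of the
# final validation accuracy; make_model() is a hypothetical factory and
# x_train / y_train / x_val / y_val are placeholders for the real data.
val_accs = []
for seed in range(5):
    tf.keras.utils.set_random_seed(seed)   # seeds Python, NumPy and TF
    model = make_model()
    hist = model.fit(x_train, y_train, validation_data=(x_val, y_val),
                     epochs=5, verbose=0)
    val_accs.append(hist.history["val_accuracy"][-1])

print(f"val accuracy: {np.mean(val_accs):.3f} +/- {np.std(val_accs):.3f}")
```

If the pattern (validation accuracy plateauing around 70% for the first two models) shows up across all seeds, it is a property of the model/data combination rather than a lucky or unlucky run.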

NikoNyrh