
In deep learning, validation loss is used to check that the model being trained is not overfitting the training data. Is there a similar notion of overfitting in deep Q-learning?

Given that I have a fixed number of experiences already in a replay buffer and I train a Q-network by sampling from this buffer, would computing a validation loss (on a held-out set of experiences, separate from those in the replay buffer) help me decide when to stop training the network?

For example, if my validation loss increases even though my training loss continues to decrease, I should stop training the network. Does the deep learning notion of validation loss also apply in the deep Q-network case?

Just to clarify again, no experiences are collected during the training of the DQN.
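
For concreteness, here is a minimal sketch of what I have in mind (PyTorch, with a random buffer and placeholder network sizes standing in for my actual setup): hold out a fraction of the fixed buffer, train only on the rest, and stop when the one-step TD loss on the held-out transitions stops improving. One wrinkle I'm aware of is that the TD targets move whenever the target network is synced, so this "validation loss" is not stationary the way a supervised one is.

```python
import torch
import torch.nn as nn

# Hypothetical setup: a small Q-network over 4-dim states with 2 actions,
# and a fixed buffer of 10k random transitions standing in for real data.
STATE_DIM, N_ACTIONS, BUFFER_SIZE, GAMMA = 4, 2, 10_000, 0.99

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Fixed replay buffer (random placeholder data; in practice these transitions
# are collected from the environment before training starts, and never change).
states = torch.randn(BUFFER_SIZE, STATE_DIM)
actions = torch.randint(0, N_ACTIONS, (BUFFER_SIZE,))
rewards = torch.randn(BUFFER_SIZE)
next_states = torch.randn(BUFFER_SIZE, STATE_DIM)
dones = torch.randint(0, 2, (BUFFER_SIZE,)).float()

# Hold out 10% of the transitions as a "validation set".
split = int(0.9 * BUFFER_SIZE)
train_idx, val_idx = torch.arange(split), torch.arange(split, BUFFER_SIZE)

def td_loss(idx):
    """One-step TD error (the standard DQN loss) on the given transitions."""
    q = q_net(states[idx]).gather(1, actions[idx].unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q = target_net(next_states[idx]).max(dim=1).values
        target = rewards[idx] + GAMMA * (1 - dones[idx]) * next_q
    return nn.functional.mse_loss(q, target)

best_val, bad_steps, patience = float("inf"), 0, 10
for step in range(500):
    batch = train_idx[torch.randint(0, len(train_idx), (64,))]  # minibatch from train split
    loss = td_loss(batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if step % 20 == 0:
        # Periodic target sync; note this shifts the TD targets, so the
        # validation loss below can jump even without overfitting.
        target_net.load_state_dict(q_net.state_dict())

    with torch.no_grad():
        val_loss = td_loss(val_idx).item()  # TD loss on held-out transitions
    if val_loss < best_val - 1e-4:
        best_val, bad_steps = val_loss, 0
    else:
        bad_steps += 1
        if bad_steps >= patience:  # validation loss stopped improving: early stop
            break
```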

  • This question may be helpful [How can I handle overfitting in reinforcement learning problems?](https://ai.stackexchange.com/q/20127/2444). – nbro May 18 '20 at 13:28

0 Answers