
In deep learning, validation loss is used to check that the model being trained is not overfitting the training data. Is there a similar notion of overfitting in deep Q-learning?

Given that I have a fixed number of experiences already in a replay buffer and I train a Q-network by sampling from this buffer, would computing a validation loss (on a held-out set of experiences, separate from those in the replay buffer) help me decide when to stop training the network?

For example, if my validation loss increases even though my training loss continues to decrease, I should stop training the network. Does the deep learning notion of validation loss also apply in the deep Q-network case?

Just to clarify again, no experiences are collected during the training of the DQN.
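
For concreteness, here is a minimal sketch of what I have in mind (PyTorch, with a random buffer and placeholder network sizes standing in for my actual setup): hold out a fraction of the fixed buffer, train only on the rest, and stop when the one-step TD loss on the held-out transitions stops improving. One wrinkle I'm aware of is that the TD targets move whenever the target network is synced, so this "validation loss" is not stationary the way a supervised one is.

```python
import torch
import torch.nn as nn

# Hypothetical setup: a small Q-network over 4-dim states with 2 actions,
# and a fixed buffer of 10k random transitions standing in for real data.
STATE_DIM, N_ACTIONS, BUFFER_SIZE, GAMMA = 4, 2, 10_000, 0.99

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Fixed replay buffer (random placeholder data; in practice these transitions
# are collected from the environment before training starts, and never change).
states = torch.randn(BUFFER_SIZE, STATE_DIM)
actions = torch.randint(0, N_ACTIONS, (BUFFER_SIZE,))
rewards = torch.randn(BUFFER_SIZE)
next_states = torch.randn(BUFFER_SIZE, STATE_DIM)
dones = torch.randint(0, 2, (BUFFER_SIZE,)).float()

# Hold out 10% of the transitions as a "validation set".
split = int(0.9 * BUFFER_SIZE)
train_idx, val_idx = torch.arange(split), torch.arange(split, BUFFER_SIZE)

def td_loss(idx):
    """One-step TD error (the standard DQN loss) on the given transitions."""
    q = q_net(states[idx]).gather(1, actions[idx].unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q = target_net(next_states[idx]).max(dim=1).values
        target = rewards[idx] + GAMMA * (1 - dones[idx]) * next_q
    return nn.functional.mse_loss(q, target)

best_val, bad_steps, patience = float("inf"), 0, 10
for step in range(500):
    batch = train_idx[torch.randint(0, len(train_idx), (64,))]  # minibatch from train split
    loss = td_loss(batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if step % 20 == 0:
        # Periodic target sync; note this shifts the TD targets, so the
        # validation loss below can jump even without overfitting.
        target_net.load_state_dict(q_net.state_dict())

    with torch.no_grad():
        val_loss = td_loss(val_idx).item()  # TD loss on held-out transitions
    if val_loss < best_val - 1e-4:
        best_val, bad_steps = val_loss, 0
    else:
        bad_steps += 1
        if bad_steps >= patience:  # validation loss stopped improving: early stop
            break
```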

  • This question may be helpful [How can I handle overfitting in reinforcement learning problems?](https://ai.stackexchange.com/q/20127/2444). – nbro May 18 '20 at 13:28

0 Answers