
I recently implemented DQN for Atari Breakout. Here is the code:

https://github.com/JeremieGauthier/AI_Exercices/blob/master/Atari_Breakout/DQN_Breakout.py

I have trained the agent for over 1500 episodes, but the training levelled off at an average score of around 5. Could someone look at the code and point out what should be corrected?


The average score does not go above 5. Is there a way to improve performance?

jgauth
  • Have you looked at the original DQN paper? How long did they train their network for? – nbro Apr 10 '20 at 14:41
  • Yeah, I have read the original paper. I trained mine for 3 to 5 hours. – jgauth Apr 10 '20 at 14:42
  • And how long have the original authors of DQN trained their network for? Probably a lot more than 3-5 hours. And in terms of episodes? – nbro Apr 10 '20 at 14:43
  • "One epoch corresponds to 50000 minibatch weight updates or roughly 30 minutes of training time." They achieved an average reward of 1800, while I get at most 5. Where am I going wrong? Will I reach 1600 or 1700 in average reward if I wait long enough? Right now, the code struggles to even break through an average reward of 5. – jgauth Apr 10 '20 at 14:55
  • What is the batch size they used? You're using a batch size of `batch_size = 256`, according to your code. Just to make sure, are you using their exact configuration of the hyper-parameters? – nbro Apr 10 '20 at 14:58
  • @nbro It seems the batch size is not indicated. Am I wrong? I tried to replicate the code as described. – jgauth Apr 10 '20 at 15:02
  • 1
    It's been a long time since I read that paper, but I was planning to read it again in the next days. Maybe I will have a look at it later. – nbro Apr 10 '20 at 15:36
  • Did the URL move? I am getting a 404 from GitHub. @jgauth – Yaniv Peretz Apr 27 '23 at 13:52
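For reference, the hyper-parameters the comments above ask about (batch size, exploration schedule, training length) are listed in the Nature DQN paper (Mnih et al., 2015, Extended Data Table 1). Below is a minimal sketch with those published values; the dictionary, helper function, and the comparison to the question's `batch_size = 256` are illustrative, not taken from the question's repository:

```python
# Hyper-parameters reported in the Nature DQN paper
# (Mnih et al., 2015, Extended Data Table 1).
NATURE_DQN_HYPERPARAMS = {
    "minibatch_size": 32,                  # the question's code uses 256
    "replay_memory_size": 1_000_000,
    "agent_history_length": 4,             # stacked frames per state
    "target_network_update_freq": 10_000,  # in parameter-update steps
    "discount_factor": 0.99,
    "action_repeat": 4,                    # frame skip
    "update_frequency": 4,                 # env steps between SGD updates
    "learning_rate": 0.00025,              # RMSProp
    "initial_exploration": 1.0,            # epsilon-greedy start
    "final_exploration": 0.1,
    "final_exploration_frame": 1_000_000,
    "replay_start_size": 50_000,           # random warm-up steps before learning
}

def linear_epsilon(frame: int) -> float:
    """Linearly annealed epsilon schedule used in the paper:
    decays from 1.0 to 0.1 over the first million frames, then stays at 0.1."""
    start = NATURE_DQN_HYPERPARAMS["initial_exploration"]
    end = NATURE_DQN_HYPERPARAMS["final_exploration"]
    horizon = NATURE_DQN_HYPERPARAMS["final_exploration_frame"]
    frac = min(frame, horizon) / horizon
    return start + frac * (end - start)

if __name__ == "__main__":
    for frame in (0, 500_000, 2_000_000):
        print(frame, round(linear_epsilon(frame), 3))
```

Note that the paper trained for tens of millions of frames (days of GPU time), so a plateau after only a few hours of training is not by itself surprising; the epsilon schedule alone keeps the agent mostly random for the first million frames.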

0 Answers