I am trying to build a DQN model for the Atari Pong game, but I am not sure whether the model is learning at all.
I am using the architecture described in the paper "Playing Atari with Deep Reinforcement Learning". I tested the model on a simpler environment (CartPole), where it worked well, but I am not seeing any progress with Pong: after 2-3 hours of training, its performance is no better than taking random actions.
Should I just keep waiting, or might there be something wrong with my code? Roughly how many episodes should it take before I see some positive results?
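
Per-episode scores in Pong are noisy, so "no better than random" is easier to judge from a moving average of episode returns than from individual episodes. Below is a minimal sketch of that check, not taken from my training code: the function name `moving_average`, the window size, and the sample returns are all hypothetical, and it assumes each episode's total reward (between -21 and +21 in Pong) is logged to a list.

```python
from collections import deque


def moving_average(returns, window=100):
    """Running mean over the last `window` episode returns.

    `returns` is assumed to be a list of per-episode total rewards;
    both the name and the default window size are illustrative choices.
    """
    recent = deque(maxlen=window)  # keeps only the newest `window` values
    averages = []
    for episode_return in returns:
        recent.append(episode_return)
        averages.append(sum(recent) / len(recent))
    return averages


if __name__ == "__main__":
    # Hypothetical logged returns, for illustration only.
    episode_returns = [-21, -21, -20, -21, -19, -20, -18]
    for i, avg in enumerate(moving_average(episode_returns, window=3), start=1):
        print(f"episode {i}: moving average = {avg:.2f}")
```

If the smoothed curve stays flat near the level a random policy reaches, that suggests the agent has not started learning yet; a slow but steady upward drift would suggest that it has.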