Why do DQNs tend to forget? Is it because, when you feed in highly correlated samples, your model (a function approximator) doesn't learn a general solution?
For example:
I train on level 1 experiences, and my model $p$ is fitted to play that level.
Then I move to level 2: the weights are updated and fitted to play level 2, which means the model no longer knows how to play level 1.
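The effect described above can be sketched outside of RL with a tiny supervised example (plain NumPy, nothing DQN-specific; the network size, target function, and hyperparameters here are arbitrary choices for illustration). A small network is fitted to one region of a function ("level 1"), then trained only on a second region ("level 2"); because both fits share the same weights, the second phase overwrites the first:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 1-hidden-layer tanh network trained with full-batch gradient descent.
W1 = rng.normal(0, 0.5, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.5, (32, 1)); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2, h

def train(x, y, steps=3000, lr=0.05):
    global W1, b1, W2, b2
    n = len(x)
    for _ in range(steps):
        pred, h = forward(x)
        err = pred - y                      # gradient of 0.5*MSE w.r.t. pred
        gW2 = h.T @ err / n;  gb2 = err.mean(0)
        dh = (err @ W2.T) * (1 - h ** 2)    # backprop through tanh
        gW1 = x.T @ dh / n;   gb1 = dh.mean(0)
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1

def mse(x, y):
    pred, _ = forward(x)
    return float(((pred - y) ** 2).mean())

# "Level 1" data: sin(x) on [-3, 0]; "level 2" data: sin(x) on [0, 3].
x1 = np.linspace(-3, 0, 64).reshape(-1, 1); y1 = np.sin(x1)
x2 = np.linspace(0, 3, 64).reshape(-1, 1);  y2 = np.sin(x2)

train(x1, y1)
err_before = mse(x1, y1)   # low: the net currently fits level 1
train(x2, y2)              # train ONLY on level 2, no replay of level 1
err_after = mse(x1, y1)    # rises: the level-1 fit was overwritten
print(err_before, err_after)
```

This is the motivation for the replay buffer in DQN: sampling a mixed batch of old and new experiences keeps the gradient updates from being dominated by whatever the agent saw most recently.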