
While reading the DQN paper, I found that randomly sampling past experiences and learning from them reduced divergence in RL with a non-linear function approximator (e.g., a neural network).

So, why does reinforcement learning with a non-linear function approximator diverge when the input data are strongly correlated?

강문주
  • Read chapter 11 of [this](http://incompleteideas.net/book/bookdraft2017nov5.pdf) book. This is only a draft; if you can find the full book, even better. Also, I think similar questions have already been answered, so try searching a bit through the website. – Brale Feb 11 '20 at 08:19
  • Maybe this is a duplicate of [Why doesn't Q-learning converge when using function approximation?](https://ai.stackexchange.com/q/11679/2444). – nbro Feb 11 '20 at 16:05

1 Answer


It is not so much a problem with using reinforcement learning to train the neural network as it is a problem with the assumptions a standard neural network makes about its training data. Standard training (e.g., with stochastic gradient descent) assumes the samples are roughly independent and identically distributed, whereas consecutive transitions generated by an RL agent are strongly correlated, so the updates can become biased towards recent experience and unstable. This inability of standard networks to handle strongly correlated data is also one of the motivations for recurrent neural networks, which are designed to model such sequential dependencies.
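
To make the connection to the mechanism the question mentions concrete, here is a minimal sketch of an experience replay buffer in the spirit of the DQN paper. The class name, capacity, and batch size below are illustrative assumptions, not values from the paper; the point is only that uniform random sampling from a pool of past transitions breaks the temporal correlation between consecutive samples.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of past transitions (state, action, reward, next_state, done)."""

    def __init__(self, capacity=10_000):
        # Oldest transitions are dropped automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Transitions are stored in the order they are experienced.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform random sampling breaks the temporal correlation between
        # consecutive transitions, so each minibatch looks closer to i.i.d. data.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Sketch of usage: instead of updating the network on the latest transition only,
# push it into the buffer and train on a randomly drawn minibatch.
#
#   buffer.push(s, a, r, s_next, done)
#   if len(buffer) >= 32:
#       batch = buffer.sample(32)
#       # compute Q-learning targets from `batch` and take one gradient step
```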

David