Why is it hard to prove the convergence of the deep Q-learning algorithm?

Asked May 10 '20 at 16:01

Active Mar 24 '21 at 11:02

Viewed 1,347 times

Why is it hard to prove the convergence of the DQN algorithm? We know that the tabular Q-learning algorithm converges to the optimal Q-values, and convergence has also been proved for Q-learning with a linear function approximator (under suitable conditions).

The main differences between DQN and Q-learning with a linear approximator are the use of a deep neural network (DNN), the experience replay memory, and the target network. Which of these components causes the difficulty, and why?
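For concreteness, here is a rough sketch of where the three components enter the update. It is only an illustration under stated assumptions, not the DQN implementation from any paper: the names `q_values`, `td_target`, and `update`, the two toy transitions, and the hyperparameters are all hypothetical, and a linear map stands in at the point where DQN would use a deep network.

```python
import random
from collections import deque

GAMMA = 0.9  # discount factor (assumed value)


def q_values(weights, state):
    # One weight vector per action. DQN would replace this linear map
    # with a deep neural network.
    return [sum(w * x for w, x in zip(w_a, state)) for w_a in weights]


def td_target(target_weights, reward, next_state, done):
    # Bootstrapped target computed from the frozen target parameters.
    if done:
        return reward
    return reward + GAMMA * max(q_values(target_weights, next_state))


def update(weights, target_weights, batch, lr=0.1):
    # Semi-gradient step on the squared TD error of each sampled transition.
    for state, action, reward, next_state, done in batch:
        target = td_target(target_weights, reward, next_state, done)
        error = target - q_values(weights, state)[action]
        weights[action] = [w + lr * error * x
                           for w, x in zip(weights[action], state)]


# Experience replay memory holding (state, action, reward, next_state, done).
replay = deque(maxlen=1000)
replay.append(([1.0, 0.0], 0, 1.0, [0.0, 1.0], False))
replay.append(([0.0, 1.0], 1, 0.0, [1.0, 0.0], True))

weights = [[0.0, 0.0], [0.0, 0.0]]
target_weights = [row[:] for row in weights]  # target network: a frozen copy

random.seed(0)
for step in range(100):
    batch = random.sample(replay, 2)  # sample past transitions from memory
    update(weights, target_weights, batch)
    if step % 10 == 9:
        target_weights = [row[:] for row in weights]  # periodic sync
```

In this sketch the only change needed to obtain DQN proper is inside `q_values` and the gradient step in `update`; the replay memory and the target copy are used in the same way.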

edited Mar 23 '21 at 09:08

nbro


asked May 10 '20 at 16:01

Afshin Oroojlooy


See [Why doesn't Q-learning converge when using function approximation?](https://ai.stackexchange.com/q/11679/2444). – nbro May 10 '20 at 16:06
Thanks for the link. I read the post carefully, but it does not actually answer my question. – Afshin Oroojlooy May 10 '20 at 23:25


0 Answers
