
I know this is a general question, but I'm just looking for intuition. What are the characteristics of problems (in terms of state space, action space, environment, or anything else you can think of) that are well suited to the family of DQN algorithms? What kinds of problems are not a good fit for DQNs?


1 Answer


I don't currently have much practical experience with DQN, but I can partially answer this question based on my theoretical knowledge and other information I have found.

DQN is typically used for

  • discrete action spaces, as the sketch after this list illustrates (although there have been attempts to apply it to continuous action spaces, such as this one)

  • discrete and continuous state spaces

  • problems where the optimal policy is deterministic (an example where the optimal policy is not deterministic is rock-paper-scissors, whose optimal policy plays each action uniformly at random, which a greedy argmax over Q-values cannot represent)

  • off-policy learning (Q-learning is an off-policy algorithm, so DQN is suitable if your data can only be, or has already been, gathered by a policy unrelated to the policy you want to estimate; note that there are other off-policy algorithms too, such as DDPG)
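To make the first and third points concrete, here is a minimal sketch of a DQN-style Q-network, assuming PyTorch (the class name QNetwork and the layer sizes are purely illustrative, not any canonical architecture). The network outputs one Q-value per action, so the action set must be finite in order to take the argmax, and the resulting greedy policy is deterministic:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a (possibly continuous) state vector to one Q-value per discrete action."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),  # one output head per discrete action
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# Greedy (deterministic) policy: pick the action with the highest Q-value.
q_net = QNetwork(state_dim=4, n_actions=2)  # e.g. CartPole-like dimensions
state = torch.randn(1, 4)
action = q_net(state).argmax(dim=1).item()
```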

This guide also states that DQN is slower to train but more sample-efficient than other approaches, due to its use of the experience replay buffer.
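For intuition about the sample-efficiency claim, here is a minimal replay-buffer sketch in Python (the class name ReplayBuffer and the capacity are illustrative): each stored transition can be sampled and reused in many gradient updates, instead of being discarded after a single on-policy update:

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions so each one can be reused in many gradient updates."""
    def __init__(self, capacity: int = 100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Uniform sampling also breaks the temporal correlation
        # between consecutive transitions.
        return random.sample(self.buffer, batch_size)
```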

Moreover, if you have a small state and action space, it is probably a good idea to just use tabular Q-learning (i.e. no function approximation), given that it is guaranteed to converge to the optimal value function (under the usual step-size and exploration conditions).
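For comparison, a tabular Q-learning update is just a table lookup and an increment; here is a minimal sketch in Python/NumPy (the grid-world dimensions and hyperparameters are made up for illustration):

```python
import numpy as np

n_states, n_actions = 16, 4   # e.g. a small grid world
alpha, gamma = 0.1, 0.99      # learning rate and discount factor
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next, done):
    """One tabular Q-learning step: Q(s, a) += alpha * (TD target - Q(s, a))."""
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
```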

See also this question, this question, and this article (which compares DQN with policy gradients).
