Questions tagged [dqn]

For questions related to the deep Q-network (DQN), a deep neural network (e.g. a convolutional neural network) trained with a variant of Q-learning. The term was coined in the paper "Playing Atari with Deep Reinforcement Learning" (2013) by DeepMind.

For more info, see the papers "Playing Atari with Deep Reinforcement Learning" (2013) by V. Mnih et al., "Human-level control through deep reinforcement learning" (2015) by V. Mnih et al., and "Implementing the Deep Q-Network" (2017) by M. Roderick et al.

335 questions
18 votes · 2 answers

Can Q-learning be used for continuous (state or action) spaces?

Many examples work with a table-based method for Q-learning. This may be suitable for a discrete state (observation) or action space, like a robot in a grid world, but is there a way to use Q-learning for continuous spaces like the control of a…
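As a quick illustration of the workaround this question hints at, here is a minimal sketch (function name, bounds, and bin counts are all illustrative) of bucketing a continuous observation so that a tabular Q-learning agent still applies. For genuinely continuous problems, function approximation (as in DQN, for states) or actor-critic methods (for actions) are the usual answer.

```python
def discretize(obs, low, high, bins):
    # Bucket each continuous dimension into a fixed number of bins,
    # producing a hashable tuple usable as a Q-table key.
    idx = []
    for x, lo, hi, n in zip(obs, low, high, bins):
        frac = (x - lo) / (hi - lo)          # position within [lo, hi)
        idx.append(min(n - 1, max(0, int(frac * n))))  # clip out-of-range values
    return tuple(idx)
```

The clipping means observations outside the declared range fall into the edge bins rather than crashing the lookup.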
18 votes · 1 answer

How to deal with a huge action space, where, at every step, there is a variable number of legal actions?

I am working on creating an RL-based AI for a certain board game. Just as a general overview of the game so that you understand what it's all about: It's a discrete turn-based game with a board of size $n \times n$ ($n$ depending on the number of…
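A common answer to this kind of question is action masking: compute Q-values for the full fixed-size action space, then mask out the currently illegal actions before taking the argmax. A minimal sketch (function and parameter names are illustrative):

```python
def masked_argmax(q_values, legal_mask):
    # Replace illegal actions' Q-values with -inf so they can never be
    # selected, regardless of what the network has learned for them.
    masked = [q if ok else float("-inf") for q, ok in zip(q_values, legal_mask)]
    return max(range(len(masked)), key=lambda a: masked[a])
```

The same mask is typically applied when computing the max over next-state Q-values in the training target.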
17 votes · 1 answer

Why does DQN require two different networks?

I was going through this implementation of DQN and I see that on lines 124 and 125 two different Q-networks have been initialized. From my understanding, I think one network predicts the appropriate action and the second network predicts the target Q…
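The gist of the usual answer: the online network is trained every step, while a second, target network (a periodically synced copy) supplies the bootstrap value, keeping the regression target stable. A toy sketch with dicts standing in for real networks (all names and constants are illustrative):

```python
import copy
import random

# Hypothetical tiny "network": a dict mapping state -> list of Q-values.
# In a real DQN these would be neural networks.
online_q = {s: [random.random() for _ in range(2)] for s in range(4)}
target_q = copy.deepcopy(online_q)  # second network: frozen copy of the first

GAMMA, LR, SYNC_EVERY = 0.99, 0.1, 100

def td_update(s, a, r, s_next, done, step):
    # The bootstrap value comes from the *target* network, so the
    # regression target does not shift on every gradient step.
    bootstrap = 0.0 if done else max(target_q[s_next])
    target = r + GAMMA * bootstrap
    # Only the online network is updated every step.
    online_q[s][a] += LR * (target - online_q[s][a])
    # Periodically copy the online weights into the target network.
    if step % SYNC_EVERY == 0:
        target_q.update(copy.deepcopy(online_q))
```

Between syncs the two networks disagree, which is exactly the point: the target stays fixed long enough for the online network to chase it.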
16 votes · 2 answers

What is the difference between Q-learning, Deep Q-learning and Deep Q-network?

Q-learning uses a table to store all state-action pairs. Q-learning is a model-free RL algorithm, so how can there be one called Deep Q-learning, since "deep" means using a DNN? Or maybe the state-action table (Q-table) is still there but the DNN is…
asked by Dee (1,283 rep)
11 votes · 1 answer

What exactly is the advantage of double DQN over DQN?

I started looking into the double DQN (DDQN). Apparently, the difference between DDQN and DQN is that in DDQN we use the main value network for action selection and the target network for outputting the Q values. However, I don't understand why…
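The core difference sits in one line of the target computation. In vanilla DQN the target network both selects and evaluates the next action, which systematically overestimates Q-values; double DQN decouples selection (online network) from evaluation (target network). A minimal sketch, with plain lists standing in for network outputs (illustrative only):

```python
GAMMA = 0.99

def dqn_target(reward, online_next, target_next):
    # Vanilla DQN: target network both selects and evaluates the action.
    return reward + GAMMA * max(target_next)

def ddqn_target(reward, online_next, target_next):
    # Double DQN: online network selects, target network evaluates,
    # which reduces the maximization (overestimation) bias.
    best = max(range(len(online_next)), key=lambda a: online_next[a])
    return reward + GAMMA * target_next[best]
```

When the two networks disagree about which action is best, the DDQN target is typically smaller, which is the debiasing effect.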
asked by Chukwudi (349 rep)
10 votes · 2 answers

Was DeepMind's DQN learning simultaneously all the Atari games?

DeepMind states that its deep Q-network (DQN) was able to continually adapt its behavior while learning to play 49 Atari games. After learning all games with the same neural net, was the agent able to play them all at 'superhuman' levels…
asked by Dion (203 rep)
8 votes · 1 answer

Is Experience Replay like dreaming?

Drawing parallels between machine learning techniques and the human brain is risky. Done carefully, it can be a powerful tool for popularization, but done without precaution, it can lead to major…
asked by 16Aghnar (591 rep)
8 votes · 1 answer

What are other ways of handling invalid actions in scenarios where all rewards are either 0 (best reward) or negative?

I created an OpenAI Gym environment, and I would like to check the performance of the agent from OpenAI Baselines DQN approach on it. In my environment, the best possible outcome for the agent is 0 - the robot needs zero non-necessary resources to…
8 votes · 1 answer

How is the DQN loss derived from (or theoretically motivated by) the Bellman equation, and how is it related to the Q-learning update?

I'm doing a project on Reinforcement Learning. I programmed an agent that uses DDQN. There are a lot of tutorials on that, so the code implementation was not that hard. However, I have problems understanding how one should come up with this kind of…
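In short, the DQN loss is the mean squared TD error between the network's prediction and the one-step Bellman target $y = r + \gamma \max_{a'} Q_{\text{target}}(s', a')$ — the same quantity the tabular Q-learning update moves toward. A sketch with callables standing in for networks (names are illustrative):

```python
def dqn_loss(batch, q_online, q_target, gamma=0.99):
    # Mean squared TD error over a batch of (s, a, r, s', done) transitions:
    # the online network is regressed toward the one-step Bellman target.
    total = 0.0
    for s, a, r, s_next, done in batch:
        y = r + (0.0 if done else gamma * max(q_target(s_next)))
        total += (y - q_online(s)[a]) ** 2
    return total / len(batch)
```

Replacing the gradient step on this loss with a plain tabular assignment recovers the classic Q-learning update rule.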
8 votes · 2 answers

What are some online courses for deep reinforcement learning?

What are some (good) online courses for deep reinforcement learning? I would like the course to be both programming and theoretical. I really liked David Silver's course, but the course dates from 2015. It doesn't really teach deep Q-learning at…
8 votes · 2 answers

Can DQN perform better than Double DQN?

I'm training both DQN and double DQN in the same environment, but DQN performs significantly better than double DQN. As I've seen in the double DQN paper, double DQN should perform better than DQN. Am I doing something wrong or is it possible?
asked by Angelo (201 rep)
7 votes · 2 answers

How to combine backpropagation in neural nets and reinforcement learning?

I have followed a course on machine learning, where we learned about the gradient descent (GD) and back-propagation (BP) algorithms, which can be used to update the weights of neural networks, and reinforcement learning, in particular, Q-learning. I…
7 votes · 2 answers

Are policy gradient methods good for large discrete action spaces?

I have seen this question asked primarily in the context of continuous action spaces. I have a large action space (~2-4k discrete actions) for my custom environment that I cannot reduce down further: I am currently trying DQN approaches but was…
asked by user9317212 (161 rep)
7 votes · 1 answer

What happens when you select actions using softmax instead of epsilon greedy in DQN?

I understand the two major branches of RL are Q-Learning and Policy Gradient methods. From my understanding (correct me if I'm wrong), policy gradient methods have an inherent exploration built-in as it selects actions using a probability…
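For context, the two exploration schemes this question contrasts can be sketched as follows (the temperature value is illustrative): epsilon-greedy explores uniformly at random, while softmax (Boltzmann) exploration weights actions by exp(Q/T), so clearly bad actions are tried less often.

```python
import math
import random

def epsilon_greedy(q_values, eps):
    # With probability eps pick uniformly at random, otherwise exploit.
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def softmax_action(q_values, temperature=1.0):
    # Boltzmann exploration: sample actions with probability
    # proportional to exp(Q / T). Low temperature -> near-greedy.
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(exps)
    probs = [e / z for e in exps]
    return random.choices(range(len(q_values)), weights=probs)[0]
```

One practical wrinkle: the temperature needs its own schedule, much like epsilon, which is part of why epsilon-greedy remains the default in many DQN implementations.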
7 votes · 2 answers

Why are reinforcement learning methods sample inefficient?

Reinforcement learning methods are considered to be extremely sample inefficient. For example, in a recent DeepMind paper by Hessel et al., they showed that in order to reach human-level performance on an Atari game running at 60 frames per second…
asked by rrz0 (263 rep)