Questions tagged [dqn]

For questions related to the deep Q-network (DQN), a deep neural network (e.g. a convolutional neural network) trained with a variant of Q-learning. The term was coined in the paper "Playing Atari with Deep Reinforcement Learning" (2013) by DeepMind.

For more info, see the papers "Playing Atari with Deep Reinforcement Learning" (2013) by V. Mnih et al., "Human-level control through deep reinforcement learning" (2015) by V. Mnih et al., and "Implementing the Deep Q-Network" (2017) by M. Roderick et al.

335 questions
18 votes · 2 answers

Can Q-learning be used for continuous (state or action) spaces?

Many examples work with a table-based method for Q-learning. This may be suitable for a discrete state (observation) or action space, like a robot in a grid world, but is there a way to use Q-learning for continuous spaces like the control of a…
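As a quick illustration of the workaround this question hints at, here is a minimal sketch (function name, bounds, and bin counts are all illustrative) of bucketing a continuous observation so that a tabular Q-learning agent still applies. For genuinely continuous problems, function approximation (as in DQN, for states) or actor-critic methods (for actions) are the usual answer.

```python
def discretize(obs, low, high, bins):
    # Bucket each continuous dimension into a fixed number of bins,
    # producing a hashable tuple usable as a Q-table key.
    idx = []
    for x, lo, hi, n in zip(obs, low, high, bins):
        frac = (x - lo) / (hi - lo)          # position within [lo, hi)
        idx.append(min(n - 1, max(0, int(frac * n))))  # clip out-of-range values
    return tuple(idx)
```

The clipping means observations outside the declared range fall into the edge bins rather than crashing the lookup.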
18 votes · 1 answer

How to deal with a huge action space, where, at every step, there is a variable number of legal actions?

I am working on creating an RL-based AI for a certain board game. Just as a general overview of the game so that you understand what it's all about: It's a discrete turn-based game with a board of size $n \times n$ ($n$ depending on the number of…
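A common answer to this kind of question is action masking: compute Q-values for the full fixed-size action space, then mask out the currently illegal actions before taking the argmax. A minimal sketch (function and parameter names are illustrative):

```python
def masked_argmax(q_values, legal_mask):
    # Replace illegal actions' Q-values with -inf so they can never be
    # selected, regardless of what the network has learned for them.
    masked = [q if ok else float("-inf") for q, ok in zip(q_values, legal_mask)]
    return max(range(len(masked)), key=lambda a: masked[a])
```

The same mask is typically applied when computing the max over next-state Q-values in the training target.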
17 votes · 1 answer

Why does DQN require two different networks?

I was going through this implementation of DQN and I see that on lines 124 and 125 two different Q-networks have been initialized. From my understanding, I think one network predicts the appropriate action and the second network predicts the target Q…
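The gist of the usual answer: the online network is trained every step, while a second, target network (a periodically synced copy) supplies the bootstrap value, keeping the regression target stable. A toy sketch with dicts standing in for real networks (all names and constants are illustrative):

```python
import copy
import random

# Hypothetical tiny "network": a dict mapping state -> list of Q-values.
# In a real DQN these would be neural networks.
online_q = {s: [random.random() for _ in range(2)] for s in range(4)}
target_q = copy.deepcopy(online_q)  # second network: frozen copy of the first

GAMMA, LR, SYNC_EVERY = 0.99, 0.1, 100

def td_update(s, a, r, s_next, done, step):
    # The bootstrap value comes from the *target* network, so the
    # regression target does not shift on every gradient step.
    bootstrap = 0.0 if done else max(target_q[s_next])
    target = r + GAMMA * bootstrap
    # Only the online network is updated every step.
    online_q[s][a] += LR * (target - online_q[s][a])
    # Periodically copy the online weights into the target network.
    if step % SYNC_EVERY == 0:
        target_q.update(copy.deepcopy(online_q))
```

Between syncs the two networks disagree, which is exactly the point: the target stays fixed long enough for the online network to chase it.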
16 votes · 2 answers

What is the difference between Q-learning, Deep Q-learning and Deep Q-network?

Q-learning uses a table to store all state-action pairs. Q-learning is a model-free RL algorithm, so how can there be one called Deep Q-learning, since "deep" means using a DNN? Or maybe the state-action table (Q-table) is still there but the DNN is…
asked by Dee (1,283 rep)
11 votes · 1 answer

What exactly is the advantage of double DQN over DQN?

I started looking into the double DQN (DDQN). Apparently, the difference between DDQN and DQN is that in DDQN we use the main value network for action selection and the target network for outputting the Q values. However, I don't understand why…
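The core difference sits in one line of the target computation. In vanilla DQN the target network both selects and evaluates the next action, which systematically overestimates Q-values; double DQN decouples selection (online network) from evaluation (target network). A minimal sketch, with plain lists standing in for network outputs (illustrative only):

```python
GAMMA = 0.99

def dqn_target(reward, online_next, target_next):
    # Vanilla DQN: target network both selects and evaluates the action.
    return reward + GAMMA * max(target_next)

def ddqn_target(reward, online_next, target_next):
    # Double DQN: online network selects, target network evaluates,
    # which reduces the maximization (overestimation) bias.
    best = max(range(len(online_next)), key=lambda a: online_next[a])
    return reward + GAMMA * target_next[best]
```

When the two networks disagree about which action is best, the DDQN target is typically smaller, which is the debiasing effect.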
asked by Chukwudi (349 rep)
10 votes · 2 answers

Was DeepMind's DQN learning simultaneously all the Atari games?

DeepMind states that its deep Q-network (DQN) was able to continually adapt its behavior while learning to play 49 Atari games. After learning all games with the same neural net, was the agent able to play them all at 'superhuman' levels…
asked by Dion (203 rep)
8 votes · 1 answer

Is Experience Replay like dreaming?

Drawing parallels between machine learning techniques and the human brain is risky. Done carefully, it can be a powerful tool for popularization, but done without precaution, it can lead to major…
asked by 16Aghnar (591 rep)
8 votes · 1 answer

What are other ways of handling invalid actions in scenarios where all rewards are either 0 (best reward) or negative?

I created an OpenAI Gym environment, and I would like to check the performance of the agent from OpenAI Baselines DQN approach on it. In my environment, the best possible outcome for the agent is 0 - the robot needs zero non-necessary resources to…
8 votes · 1 answer

How is the DQN loss derived from (or theoretically motivated by) the Bellman equation, and how is it related to the Q-learning update?

I'm doing a project on Reinforcement Learning. I programmed an agent that uses DDQN. There are a lot of tutorials on that, so the code implementation was not that hard. However, I have problems understanding how one should come up with this kind of…
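In short, the DQN loss is the mean squared TD error between the network's prediction and the one-step Bellman target $y = r + \gamma \max_{a'} Q_{\text{target}}(s', a')$ — the same quantity the tabular Q-learning update moves toward. A sketch with callables standing in for networks (names are illustrative):

```python
def dqn_loss(batch, q_online, q_target, gamma=0.99):
    # Mean squared TD error over a batch of (s, a, r, s', done) transitions:
    # the online network is regressed toward the one-step Bellman target.
    total = 0.0
    for s, a, r, s_next, done in batch:
        y = r + (0.0 if done else gamma * max(q_target(s_next)))
        total += (y - q_online(s)[a]) ** 2
    return total / len(batch)
```

Replacing the gradient step on this loss with a plain tabular assignment recovers the classic Q-learning update rule.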
8 votes · 2 answers

What are some online courses for deep reinforcement learning?

What are some (good) online courses for deep reinforcement learning? I would like the course to be both programming and theoretical. I really liked David Silver's course, but the course dates from 2015. It doesn't really teach deep Q-learning at…
8 votes · 2 answers

Can DQN perform better than Double DQN?

I'm training both DQN and double DQN in the same environment, but DQN performs significantly better than double DQN. As I've seen in the double DQN paper, double DQN should perform better than DQN. Am I doing something wrong or is it possible?
asked by Angelo (201 rep)
7 votes · 2 answers

How to combine backpropagation in neural nets and reinforcement learning?

I have followed a course on machine learning, where we learned about the gradient descent (GD) and back-propagation (BP) algorithms, which can be used to update the weights of neural networks, and reinforcement learning, in particular, Q-learning. I…
7 votes · 2 answers

Are policy gradient methods good for large discrete action spaces?

I have seen this question asked primarily in the context of continuous action spaces. I have a large action space (~2-4k discrete actions) for my custom environment that I cannot reduce down further: I am currently trying DQN approaches but was…
asked by user9317212 (161 rep)
7 votes · 1 answer

What happens when you select actions using softmax instead of epsilon greedy in DQN?

I understand the two major branches of RL are Q-Learning and Policy Gradient methods. From my understanding (correct me if I'm wrong), policy gradient methods have an inherent exploration built-in as it selects actions using a probability…
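For context, the two exploration schemes this question contrasts can be sketched as follows (the temperature value is illustrative): epsilon-greedy explores uniformly at random, while softmax (Boltzmann) exploration weights actions by exp(Q/T), so clearly bad actions are tried less often.

```python
import math
import random

def epsilon_greedy(q_values, eps):
    # With probability eps pick uniformly at random, otherwise exploit.
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def softmax_action(q_values, temperature=1.0):
    # Boltzmann exploration: sample actions with probability
    # proportional to exp(Q / T). Low temperature -> near-greedy.
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(exps)
    probs = [e / z for e in exps]
    return random.choices(range(len(q_values)), weights=probs)[0]
```

One practical wrinkle: the temperature needs its own schedule, much like epsilon, which is part of why epsilon-greedy remains the default in many DQN implementations.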
7 votes · 2 answers

Why are reinforcement learning methods sample inefficient?

Reinforcement learning methods are considered to be extremely sample inefficient. For example, in a recent DeepMind paper by Hessel et al., they showed that in order to reach human-level performance on an Atari game running at 60 frames per second…
asked by rrz0 (263 rep)