Questions tagged [deep-rl]

For questions related to deep reinforcement learning (DRL), that is, RL combined with deep learning. More precisely, deep neural networks are used to represent e.g. value functions or policies.

487 questions
24
votes
2 answers

Are there other approaches to deal with variable action spaces?

This question is about reinforcement learning and variable action spaces for every/some states. Variable action space: Let's say you have an MDP where the number of actions varies between states (for example, as in Figure 1 or Figure 2). We can…
22
votes
3 answers

Why doesn't Q-learning converge when using function approximation?

The tabular Q-learning algorithm is guaranteed to find the optimal $Q$ function, $Q^*$, provided the following conditions (the Robbins-Monro conditions) regarding the learning rate are satisfied $\sum_{t} \alpha_t(s, a) = \infty$ $\sum_{t}…
nbro
  • 39,006
  • 12
  • 98
  • 176
18
votes
1 answer

How does LSTM in deep reinforcement learning differ from experience replay?

In the paper Deep Recurrent Q-Learning for Partially Observable MDPs, the author processed the Atari game frames with an LSTM layer at the end. My questions are: How does this method differ from the experience replay, as they both use past…
17
votes
1 answer

Why does DQN require two different networks?

I was going through this implementation of DQN and I see that on lines 124 and 125 two different Q networks have been initialized. From my understanding, one network predicts the appropriate action and the second network predicts the target Q…
16
votes
2 answers

What is the difference between Q-learning, Deep Q-learning and Deep Q-network?

Q-learning uses a table to store all state-action pairs. Q-learning is a model-free RL algorithm, so how can there be a variant called deep Q-learning, given that "deep" means using a DNN? Or maybe the state-action table (Q-table) is still there but the DNN is…
Dee
  • 1,283
  • 1
  • 11
  • 35
14
votes
2 answers

How large should the replay buffer be?

I'm learning the DDPG algorithm by following the OpenAI Spinning Up document on DDPG, where it is written: "In order for the algorithm to have stable behavior, the replay buffer should be large enough to contain a wide range of…
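As a rough illustration of what "replay buffer size" means in practice, here is a minimal sketch of a fixed-capacity buffer. The class name `ReplayBuffer` and its methods are hypothetical, not taken from Spinning Up or any library; the only assumption is that the oldest transitions are discarded once the capacity is reached.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-capacity buffer; the oldest transitions are evicted first."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest item when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: with capacity 3, only the 3 most recent of 5 transitions survive.
buf = ReplayBuffer(capacity=3)
for t in range(5):
    buf.add(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
batch = buf.sample(2)
```

A small capacity keeps only recent experience, while a large one retains older and more varied transitions; which is preferable depends on the task.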
11
votes
1 answer

What exactly is the advantage of double DQN over DQN?

I started looking into the double DQN (DDQN). Apparently, the difference between DDQN and DQN is that in DDQN we use the main value network for action selection and the target network for outputting the Q values. However, I don't understand why…
Chukwudi
  • 349
  • 2
  • 7
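The difference described in the excerpt above can be sketched for a single transition. This is an illustrative NumPy sketch, not code from the question; the function names and the example Q-values are assumptions.

```python
import numpy as np


def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * np.max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online (main) network selects the action,
    the target network supplies its Q-value."""
    if done:
        return reward
    best_action = int(np.argmax(next_q_online))
    return reward + gamma * next_q_target[best_action]


# Made-up Q-values for the next state (3 actions), for illustration only.
next_q_online = np.array([1.0, 2.0, 0.5])
next_q_target = np.array([3.0, 1.5, 0.2])

y_dqn = dqn_target(1.0, next_q_target, gamma=0.9)
y_ddqn = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.9)
```

Because the action is chosen by one network and evaluated by the other, the Double DQN target is never larger than the DQN target for the same inputs.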
10
votes
3 answers

How can you represent the state and action spaces for a card game in the case of a variable number of cards and actions?

I know how a machine can learn to play Atari games (Breakout): Playing Atari with Reinforcement Learning. With the same technique, it is even possible to play FPS games (Doom): Playing FPS Games with Reinforcement Learning. Further studies even…
10
votes
2 answers

Was DeepMind's DQN learning simultaneously all the Atari games?

DeepMind states that its deep Q-network (DQN) was able to continually adapt its behavior while learning to play 49 Atari games. After learning all games with the same neural net, was the agent able to play them all at 'superhuman' levels…
Dion
  • 203
  • 2
  • 6
9
votes
2 answers

What are the biggest barriers to get RL in production?

I am studying the state of the art of Reinforcement Learning, and my point is that we see so many applications in the real world using Supervised and Unsupervised learning algorithms in production, but I don't see the same thing with Reinforcement…
8
votes
1 answer

Is Experience Replay like dreaming?

Drawing parallels between machine learning techniques and the human brain is a dangerous operation. When it is done successfully, it can be a powerful tool for popularization, but when it is done with no precaution, it can lead to major…
16Aghnar
  • 591
  • 2
  • 10
8
votes
2 answers

What is experience replay in layman's terms?

I've been reading Google's DeepMind Atari paper and I'm trying to understand the concept of "experience replay". Experience replay comes up in a lot of other reinforcement learning papers (particularly, the AlphaGo paper), so I want to understand…
8
votes
2 answers

Where to publish a first article in Deep Reinforcement Learning?

What would be examples of journals that are good for a first publication in the field of Deep Reinforcement Learning? I am in the process of writing about the research results of DQN-related algorithms. I have 3 requirements - it should be indexed…
Evalds Urtans
  • 377
  • 3
  • 9
8
votes
2 answers

What are some online courses for deep reinforcement learning?

What are some (good) online courses for deep reinforcement learning? I would like the course to be both programming and theoretical. I really liked David Silver's course, but the course dates from 2015. It doesn't really teach deep Q-learning at…
8
votes
2 answers

Can DQN perform better than Double DQN?

I'm training both DQN and double DQN in the same environment, but DQN performs significantly better than double DQN. As I've seen in the double DQN paper, double DQN should perform better than DQN. Am I doing something wrong or is it possible?
Angelo
  • 201
  • 2
  • 16