For questions about action spaces in the context of reinforcement learning and other AI sub-fields.
Questions tagged [action-spaces]
40 questions
24
votes
2 answers
Are there other approaches to deal with variable action spaces?
This question is about reinforcement learning and action spaces that vary across some or all states.
Variable action space
Let's say you have an MDP, where the number of actions varies between states (for example like in Figure 1 or Figure 2). We can…

Rikard Olsson
- 341
- 1
- 3
- 8
18
votes
1 answer
How to deal with a huge action space, where, at every step, there is a variable number of legal actions?
I am working on creating an RL-based AI for a certain board game. Just as a general overview of the game so that you understand what it's all about: It's a discrete turn-based game with a board of size $n \times n$ ($n$ depending on the number of…

ytolochko
- 365
- 2
- 5
16
votes
3 answers
How to implement a variable action space in Proximal Policy Optimization?
I'm coding a Proximal Policy Optimization (PPO) agent with the Tensorforce library (which is built on top of TensorFlow).
The first environment was very simple. Now, I'm diving into a more complex environment, where not all of the actions are available…

Max
- 163
- 1
- 6
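A common answer to the PPO question above is action masking: set the logits of unavailable actions to negative infinity before the softmax, so illegal actions get exactly zero probability (and contribute nothing to the gradient of the sampled action). A minimal pure-Python sketch with hypothetical logits, independent of any particular library:

```python
import math

def masked_softmax(logits, legal):
    """Softmax over logits, with illegal actions forced to -inf.

    Illegal actions receive exactly zero probability; the remaining
    probabilities renormalize over the legal subset.
    """
    masked = [l if ok else float("-inf") for l, ok in zip(logits, legal)]
    m = max(masked)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in masked]
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical policy logits for 4 actions; actions 1 and 3 are illegal.
probs = masked_softmax([2.0, 1.0, 0.5, -1.0], [True, False, True, False])
```

In practice the same trick is applied to the logits tensor before constructing the categorical distribution, so sampling, log-probabilities, and entropy all respect the mask.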
10
votes
3 answers
What do the different actions of the OpenAI gym's environment of 'Pong-v0' represent?
Printing action_space for Pong-v0 gives Discrete(6) as output, i.e. $0, 1, 2, 3, 4, 5$ are actions defined in the environment as per the documentation. However, the game needs only 2 controls. Why do we have this discrepancy? Further, is that…

cur10us
- 211
- 1
- 2
- 4
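For the Pong question above: in Gym's Atari environments, `env.unwrapped.get_action_meanings()` maps each discrete index to a joystick/button combination. A small sketch of that mapping for Pong-v0, hard-coded here so it runs without the gym package installed:

```python
# Action meanings reported by Gym's Pong-v0 via
# env.unwrapped.get_action_meanings(). RIGHT/LEFT move the paddle,
# and the redundant *FIRE variants are why Discrete(6) exposes more
# actions than the two controls the game actually needs.
meanings = ["NOOP", "FIRE", "RIGHT", "LEFT", "RIGHTFIRE", "LEFTFIRE"]

# A reduced interface keeping only the controls Pong really uses:
UP = meanings.index("RIGHT")    # 2
DOWN = meanings.index("LEFT")   # 3
STAY = meanings.index("NOOP")   # 0
```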
7
votes
0 answers
Is there a difference in the architecture of deep reinforcement learning when multiple actions are performed instead of a single action?
I've built a deep deterministic policy gradient reinforcement learning agent to be able to handle any games/tasks that have only one action. However, the agent seems to fail horribly when there are two or more actions. I tried to look online for…

Rui Nian
- 423
- 3
- 13
6
votes
1 answer
How does the Alpha Zero's move encoding work?
I am a beginner in AI. I'm trying to train a multi-agent RL algorithm to play chess. One issue that I ran into was representing the action space (legal moves/or honestly just moves in general) numerically. I looked up how Alpha Zero represented it,…

Akshay Ghosh
- 105
- 4
6
votes
1 answer
Are there RL techniques to deal with incremental action spaces?
Let's say we have a problem that can be solved by some RL algorithms (DQN, for example, because we have discrete action space). At first, the action space is fixed (the number of actions is $n_1$), and we have already well trained an offline DQN…

user29643
- 61
- 1
5
votes
1 answer
How to deal with different actions for different states of the environment?
I'm new to this AI/Machine Learning and was playing around with OpenAI Gym a bit. When looking through the environments, I came across the Blackjack-v0 environment, which is a basic implementation of the game where the state is the hand count of the…

SomeDudeCalledMo
- 61
- 1
- 3
5
votes
1 answer
Is the agent aware of a possible different set of actions for each state?
I have a use case where the set of actions is different for different states. Is the agent aware of what actions are valid for each state, or is the agent only aware of the entire action space (in which case I guess the environment needs to discard…

Francis Chang
- 61
- 1
4
votes
1 answer
How should I define the action space for a card game like Magic: The Gathering?
I'm trying to learn about reinforcement learning techniques. I have a little machine-learning background from university, but have never done more than run a CNN on the MNIST database.
My first project was to use reinforcement learning on tic-tac-toe and…

Why Not
- 43
- 5
3
votes
1 answer
Take action only at the beginning of the episode, not during each step
I am working in a reinforcement learning environment with a 1-dimensional action space. My action is only used at the first timestep of an episode and never again. In other words, the action only affects the agent's behavior at timestep 1 and is not…

Optical_flow_lover
- 41
- 2
3
votes
0 answers
How to deal with variable action ranges in RL for continuous action spaces
I am reading this paper on battery management using RL. The action consists of the charging/discharging power of the battery at timestep $t$. For instance, in the case of the charging power, the maximum of this action can be given by the maximum…

Leibniz
- 69
- 4
3
votes
1 answer
How to use DQN when the action space can be different at different time steps?
I would like to employ DQN to solve a constrained MDP problem with constraints on the action space. At different time steps until the end, the available actions differ, with possibilities such as:
0, 1, 2, 3, 4
0, 2, 3,…

ycenycute
- 341
- 1
- 2
- 6
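One standard workaround for the question above is to keep the DQN output fixed-size (one Q-value per action in the full set) and restrict the greedy argmax to whichever subset is legal at the current time step. A toy sketch with made-up Q-values:

```python
def greedy_legal_action(q_values, legal_actions):
    """Greedy action restricted to the legal subset.

    The network still outputs a Q-value for every action in the full
    space; the argmax simply ignores indices that are illegal now.
    """
    return max(legal_actions, key=lambda a: q_values[a])

q = [0.1, 0.9, 0.3, 0.5, 0.2]               # hypothetical Q-values, 5 actions
action = greedy_legal_action(q, [0, 2, 3])  # action 1 is illegal this step
```

During training, the max in the TD target is restricted the same way, using the legal set of the successor state.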
3
votes
1 answer
Why does a Deep Q Network output multiple Q values?
I am learning Deep RL following this tutorial: https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8
I understand everything but one detail:
This image shows the difference between a classic Q learning table…

NMO
- 133
- 4
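On the last question: a DQN takes only the state as input and outputs one Q-value per action in a single forward pass, so the greedy action is just an argmax over that output vector, with no need to re-run the network once per action. A toy linear stand-in for the network (the weights are made up for illustration):

```python
def q_values(state, weights):
    """One Q-value per action from a single pass over the state.

    state: list of state features; weights: one weight row per action,
    standing in for the network's final layer.
    """
    return [sum(w * s for w, s in zip(row, state)) for row in weights]

state = [1.0, 0.5]
weights = [[0.2, 0.1],    # produces Q(s, a=0)
           [0.9, -0.3],   # produces Q(s, a=1)
           [0.0, 0.4]]    # produces Q(s, a=2)

qs = q_values(state, weights)
greedy = max(range(len(qs)), key=qs.__getitem__)  # argmax over actions
```

This is why the table-based picture (one Q per state-action lookup) and the network picture (a vector of Q-values per state) compute the same quantity.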
2
votes
2 answers
How can I design a reinforcement learning model for a game with multiple complex actions taken at a time?
I have a hex-map, turn-based wargame featuring WWII carrier battles.
On a given turn, a player may choose to perform a large number of actions. Actions can be of many different types, and some actions may be performed independently of each…

Carrier Battles
- 89
- 7