Questions tagged [continuous-action-spaces]
For questions about continuous action spaces in the context of reinforcement learning (or other artificial intelligence sub-fields). There is also a corresponding tag for discrete action spaces.
32 questions
18 votes · 2 answers
Can Q-learning be used for continuous (state or action) spaces?
Many examples work with a table-based method for Q-learning. This may be suitable for a discrete state (observation) or action space, like a robot in a grid world, but is there a way to use Q-learning for continuous spaces like the control of a…

Bryan McGill (431 · 1 · 3 · 12)
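One common first answer is to discretize the continuous state space and keep the tabular update; for continuous actions, the max over actions in the update becomes the hard part, which is what actor-critic methods such as DDPG address. A minimal sketch of the discretization route, assuming a hypothetical 2-D state in $[0, 1]^2$ and four discrete actions:

```python
import numpy as np

n_bins, n_actions = 10, 4
Q = np.zeros((n_bins, n_bins, n_actions))
alpha, gamma = 0.1, 0.99

def to_cell(state):
    # Map a continuous state in [0, 1]^2 to a discrete grid cell.
    return tuple(np.clip((np.asarray(state) * n_bins).astype(int), 0, n_bins - 1))

def q_update(s, a, r, s_next):
    # Tabular Q-learning update applied to the discretized state.
    cell, cell_next = to_cell(s), to_cell(s_next)
    Q[cell][a] += alpha * (r + gamma * Q[cell_next].max() - Q[cell][a])
```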
6 votes · 1 answer
What techniques are used to make a discretized MDP state space manageable?
Generating a discretized state space for an MDP (Markov Decision Process) model seems to suffer from the curse of dimensionality.
Suppose my state has a few simple features:
Feeling: Happy/Neutral/Sad
Hunger: Hungry/Neither/Full
Food left:…

Brendan Hill (263 · 1 · 6)
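The blow-up the question runs into is multiplicative: with $k$ independent features of $n$ values each, the discretized state space has $n^k$ states, which is why state aggregation, coarser discretization, or function approximation get used instead. A quick illustration (the feature counts here are hypothetical):

```python
# Each extra 3-valued feature multiplies the number of states by 3.
features = [3, 3, 5]              # e.g. feeling, hunger, food-left levels
n_states = 1
for n in features:
    n_states *= n
print(n_states)                   # 45 states for three small features

print(3 ** 20)                    # 3486784401: ~3.5 billion for 20 such features
```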
6 votes · 1 answer
How are continuous actions sampled (or generated) from the policy network in PPO?
I am trying to understand and reproduce the Proximal Policy Optimization (PPO) algorithm in detail. One thing that I find missing in the paper introducing the algorithm is how exactly actions $a_t$ are generated given the policy network…

Daniel B. (805 · 1 · 4 · 13)
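In most public PPO implementations for continuous actions (the paper leaves this step implicit), the policy network outputs the mean of a diagonal Gaussian, a state-independent log standard deviation is learned as a free parameter, and $a_t$ is sampled from that Gaussian; the sample's log-probability is what enters the clipped ratio. A minimal PyTorch sketch (layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.mean_net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim)
        )
        # State-independent log standard deviation, learned directly.
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs):
        dist = torch.distributions.Normal(self.mean_net(obs), self.log_std.exp())
        action = dist.sample()
        # Sum over action dimensions: the Gaussian is diagonal.
        return action, dist.log_prob(action).sum(-1)

policy = GaussianPolicy(obs_dim=8, act_dim=2)
a_t, logp_t = policy(torch.randn(8))
```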
5 votes · 1 answer
Is it possible to solve a problem with continuous action spaces and no states with reinforcement learning?
I want to use reinforcement learning to optimize the distribution of energy for a peak-shaving problem given by a thermodynamic simulation. However, I am not sure how to proceed, as the action space is the only thing that really matters, in this…

FS93 (145 · 6)
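A problem with actions but no state is essentially a continuous-armed bandit, i.e. black-box optimization of the expected return over the action space, so derivative-free methods are a natural fit alongside policy gradients. A sketch of the cross-entropy method, where `episode_return` is a hypothetical stand-in for the thermodynamic simulation:

```python
import numpy as np

def episode_return(action):
    # Placeholder objective; in practice this would run the simulation.
    return -np.sum((action - 0.3) ** 2)

rng = np.random.default_rng(0)
mu, sigma = np.zeros(3), np.ones(3)         # search distribution over 3-D actions
for _ in range(100):
    actions = rng.normal(mu, sigma, size=(64, 3))
    returns = np.array([episode_return(a) for a in actions])
    elite = actions[np.argsort(returns)[-8:]]   # keep the 8 best samples
    mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-3
print(mu)   # converges toward the optimum (0.3 in each dimension here)
```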
4 votes · 1 answer
Can a large discrete action space be represented using Gaussian distributions?
I have a large 1D action space, e.g. dim(A) = 2000–10000. Can I use a continuous action space where I learn the mean and std of a Gaussian distribution, sample an action from it, and round the value to the nearest integer? If yes,…

Mika (331 · 1 · 8)
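The construction the question describes is used in practice; the usual caveat is that the policy gradient is taken with respect to the log-probability of the continuous sample, while the environment receives the rounded, clipped integer. A sketch with a hypothetical range of 0–9999:

```python
import torch

dist = torch.distributions.Normal(torch.tensor([4200.0]), torch.tensor([150.0]))

a_cont = dist.sample()
# The environment sees the rounded, clipped integer action...
a_env = int(a_cont.round().clamp(0, 9999).item())
# ...while the learning update uses the log-prob of the continuous sample.
logp = dist.log_prob(a_cont)
```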
4 votes · 0 answers
What is the simplest policy gradient method to implement for a problem with a continuous action space?
I have a problem I would like to tackle with RL, but I am not sure if it is even doable.
My agent has to figure out how to fill a very large vector (let's say from 600 to 4000 in the most complex setting) made of natural numbers, i.e. a 600 vector…

FS93 (145 · 6)
3 votes · 0 answers
How to deal with variable action ranges in RL for continuous action spaces
I am reading this paper on battery management using RL. The action consists of the charging/discharging power of the battery at timestep $t$. For instance, in the case of the charging power, the maximum of this action can be given by the maximum…

Leibniz (69 · 4)
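A common recipe for time-varying bounds is to let the policy output a normalized action in $[-1, 1]$ (e.g. via tanh) and affinely rescale it into the feasible range at each timestep, so the network never has to learn the bounds themselves. A sketch, with hypothetical bounds `p_min_t` and `p_max_t` taken from the battery model:

```python
import numpy as np

def rescale(a_norm, low, high):
    # Map a normalized action in [-1, 1] onto the interval [low, high].
    return low + 0.5 * (a_norm + 1.0) * (high - low)

a_norm = np.tanh(0.7)            # raw policy output squashed into [-1, 1]
p_min_t, p_max_t = -2.0, 5.0     # hypothetical feasible charging power at step t
power = rescale(a_norm, p_min_t, p_max_t)
```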
2 votes · 1 answer
Can neural networks have continuous inputs and outputs, or do they have to be discrete?
In general, can ANNs have continuous inputs and outputs, or do they have to be discrete?
So, basically, I would like to have a mapping of continuous inputs to continuous outputs. Is this possible? Does this depend on the type of ANN?
More…

PeterBe (212 · 1 · 11)
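In general, yes: with real-valued inputs and a linear (identity) output layer, a feed-forward network is exactly a continuous-to-continuous mapping; discrete outputs are the special case obtained by adding a softmax/argmax on top. A minimal PyTorch sketch (the dimensions are arbitrary):

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(13, 64), nn.ReLU(),
    nn.Linear(64, 3)             # linear output: unbounded continuous values
)
y = net(torch.randn(13))         # 3 continuous outputs for 13 continuous inputs
```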
1 vote · 0 answers
How to generalize finite MDP to general MDP?
Suppose, for simplicity's sake, that we are in a discrete-time domain with the action set being the same for all states $s \in \mathcal{S}$. Thus, in a finite Markov Decision Process, the sets $\mathcal{A}$, $\mathcal{S}$, and $\mathcal{R}$ have a finite…

gvgramazio (696 · 2 · 7 · 19)
1 vote · 1 answer
Model-based learning in continuous state and action spaces
I am interested in learning how transition probabilities/MDPs are constructed in the continuous state and action space model-based learning setting. There is some literature available on this matter, but it does not explicitly construct the model to…

hogger (11 · 2)
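One standard construction in continuous model-based RL is to fit a parametric density $\hat{p}(s' \mid s, a)$, e.g. a network that outputs the mean and log-variance of a Gaussian over the next state, trained by maximizing the likelihood of observed transitions. A sketch (the single linear layer is a placeholder for a real architecture):

```python
import torch
import torch.nn as nn

class GaussianDynamics(nn.Module):
    def __init__(self, s_dim, a_dim):
        super().__init__()
        self.net = nn.Linear(s_dim + a_dim, 2 * s_dim)   # mean and log-variance

    def loss(self, s, a, s_next):
        mean, log_var = self.net(torch.cat([s, a], dim=-1)).chunk(2, dim=-1)
        dist = torch.distributions.Normal(mean, (0.5 * log_var).exp())
        # Negative log-likelihood of the observed transitions.
        return -dist.log_prob(s_next).sum(-1).mean()
```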
1 vote · 1 answer
Can RL be applied to problems where the next state is not the next observation?
I'm quite new to the study of reinforcement learning, and I'm working on a communication problem with a large continuous action range for my final graduation project. I'm trying to use a Gaussian policy and policy gradient methods for that implementation.…

MaarcosNascimen (21 · 2)
1 vote · 1 answer
What would be the Bellman optimality equation for $q_*(s, a)$ for an MDP with continuous states and actions?
I'm currently studying reinforcement learning and I'd like to know what the Bellman optimality equation for action values $q_*(s, a)$ would be for an MDP with continuous states and actions, written out using explicit integration (no expectation…

user (145 · 9)
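For reference, replacing the sums in the finite-MDP form with integrals (and the max over a continuous action set with a supremum), and assuming a transition density $p(s', r \mid s, a)$, one common way to write it is:

$$q_*(s, a) = \int_{\mathcal{S}} \int_{\mathcal{R}} p(s', r \mid s, a) \Big[ r + \gamma \sup_{a' \in \mathcal{A}} q_*(s', a') \Big] \, \mathrm{d}r \, \mathrm{d}s'$$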
1 vote · 0 answers
How can I get an integer as output for continuous action space PPO reinforcement learning?
I have a huge discrete action space, and the learning stability is not good. I'd like to move to a continuous action space, but the only valid output for my task is a positive integer (let's say in the range 0 to 999). How can I force the DNN to output a…

D.g (111 · 1)
1 vote · 0 answers
What kind of reinforcement learning algorithms can be used when the action space is unfeasibly large?
I know the Deep Q-Network (DQN) as an $S \times A$ DNN which maps the $S$-dimensional state space to the Q-values of $A$ distinct actions.
In my problem, the action space is still discrete and finite, but depending on some parameters (e.g. the number of users in a…

Della (111 · 2)
1 vote · 1 answer
Reinforcement learning algorithms for large problems that are not based on a neural network
I have a large control problem with multidimensional continuous inputs (13) and outputs (3). I tried several reinforcement learning algorithms such as Deep Q-Networks (DQN), Proximal Policy Optimization (PPO) and Advantage Actor-Critic (A2C).…

PeterBe (212 · 1 · 11)