How to deal with variable action ranges in RL for continuous action spaces

Asked Feb 24 '22 at 10:22

Active Mar 29 '22 at 10:49

Viewed 191 times

I am reading this paper on battery management using RL. The action consists of the charging/discharging power of the battery at timestep $t$. For the charging power, for instance, the upper bound of the action is set either by the maximum charging speed $c^{\max }$ or by the battery's state of charge, since the battery cannot be charged beyond $100\%$. Therefore, the charging action has the following range:

$$ 0 \leq c_{t} \leq \min \left\{c^{\max }, \frac{B^{\max }-B_{t}}{\eta_{c}}\right\} $$

At some timesteps the upper bound of $c_{t}$ will be $c^{\max }$, and at others it will be $\frac{B^{\max }-B_{t}}{\eta_{c}}$. What would be the best way of implementing a variable action range? I have thought of using a range of $[0,1]$ for the action and scaling it to the valid range at each timestep. Is there a standard way to deal with variable ranges?
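To make the $[0,1]$ rescaling idea concrete, here is a minimal sketch of what I have in mind. The function names (`charge_upper_bound`, `scale_action`) and the numbers in the usage lines are hypothetical and do not come from the paper; only the bound itself is taken from the inequality above. The policy would output a normalized action, and the environment would map it to the currently feasible charging power.

```python
def charge_upper_bound(b_t, b_max, c_max, eta_c):
    """State-dependent upper limit on c_t: min{c_max, (B_max - B_t) / eta_c}."""
    return min(c_max, (b_max - b_t) / eta_c)


def scale_action(a_norm, b_t, b_max, c_max, eta_c):
    """Map a normalized action in [0, 1] to the feasible range [0, upper bound].

    Values outside [0, 1] are clipped first (an assumption of this sketch),
    so the returned charging power always satisfies the constraint.
    """
    a = min(max(a_norm, 0.0), 1.0)
    return a * charge_upper_bound(b_t, b_max, c_max, eta_c)


# Hypothetical numbers: far from full, the bound is c_max = 3.0
half_power = scale_action(0.5, b_t=2.0, b_max=10.0, c_max=3.0, eta_c=0.5)

# Close to full, the bound is (10.0 - 9.0) / 0.5 = 2.0 instead
near_full = scale_action(1.0, b_t=9.0, b_max=10.0, c_max=3.0, eta_c=0.5)
```

With this mapping the agent effectively learns a fraction of the currently available charging power, so the same normalized action corresponds to different physical powers in different states.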

edited Mar 29 '22 at 10:49

Leibniz

[Here](https://ai.stackexchange.com/q/9491/2444) is a related question, but your question seems to be in the context of continuous action spaces, so I would recommend that you change your title to emphasize that. See also [this](https://ai.stackexchange.com/q/7755/2444). – nbro Feb 28 '22 at 10:16

0 Answers