I have a huge discrete action space, and learning is unstable. I'd like to move to a continuous action space, but the only valid output for my task is a non-negative integer (let's say in the range 0 to 999). How can I force the DNN to output an integer in that range?
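To make the question concrete, here is a minimal sketch of the naive mapping I can think of: squash a single continuous output with tanh, rescale it to the valid range, and round. The names `to_integer_action` and `N_ACTIONS` are just illustrative, and I'm assuming the network emits one unbounded scalar per action. I'm not sure this is the right approach, since rounding is not differentiable and would have to happen outside the network (for example, in an environment wrapper).

```python
import math

N_ACTIONS = 1000  # valid actions are the integers 0..999 (range from the question)


def to_integer_action(raw_output: float) -> int:
    """Map an unbounded continuous network output to an integer in [0, 999].

    Hypothetical helper for illustration only; it is applied after the
    network, so no gradient flows through the rounding step.
    """
    squashed = math.tanh(raw_output)                    # bounded to [-1, 1]
    scaled = (squashed + 1.0) / 2.0 * (N_ACTIONS - 1)   # rescaled to [0, 999]
    action = int(round(scaled))                         # nearest integer
    return max(0, min(N_ACTIONS - 1, action))           # clip as a safety net


if __name__ == "__main__":
    for raw in (-5.0, -0.3, 0.2, 5.0):
        print(raw, to_integer_action(raw))
```

With this mapping the agent would still learn in the continuous space, and the environment would only ever see the rounded integer.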