Are policy-based methods better than value-based methods only for large action spaces?

Asked Jun 23 '20 at 07:35

Several books on reinforcement learning motivate policy-based methods by their ability to handle large (continuous) action spaces. Is this the only motivation for policy-based methods? Suppose the action space is tiny (say, only 9 possible actions), but each action costs a huge amount of resources and there is no model of the MDP. Would this also be a good application of policy-based methods?

edited Jun 23 '20 at 10:45

nbro

asked Jun 23 '20 at 07:35

tmaric

0 Answers