
I have a large control problem with multidimensional continuous inputs (13 dimensions) and outputs (3 dimensions). I have tried several reinforcement learning algorithms, such as Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), and Advantage Actor-Critic (A2C). Unfortunately, they all yield poor results. As far as I understand, they are all based on neural networks, so I suspect the neural network itself might be the problem: it may simply not be able to learn the mapping between inputs and outputs (I have experienced this in several other applications).

So, are there state-of-the-art reinforcement learning algorithms for large problems with multidimensional continuous state and action spaces?

PeterBe

1 Answer


There are many state-of-the-art reinforcement learning algorithms for large problems with multidimensional continuous state and action spaces. All of them rely on some sort of function approximator.

You can use any RL algorithm with essentially any sort of function approximator: a neural network, a support vector machine, a decision tree, or any other regression method. Every algorithm you mentioned can, in principle, use any of these in place of a neural network, if so desired.
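For intuition, here is a minimal sketch of one RL algorithm (semi-gradient Q-learning) running on a non-neural function approximator, namely a plain linear model. The environment, hyperparameters, and feature choice are illustrative assumptions, not taken from the question:

    # Sketch: semi-gradient Q-learning with a LINEAR function approximator
    # instead of a neural network. Environment and hyperparameters are
    # placeholder assumptions for illustration only.
    import numpy as np
    import gymnasium as gym

    env = gym.make("CartPole-v1")      # placeholder discrete-action task
    n_actions = env.action_space.n
    n_features = env.observation_space.shape[0]

    # One weight vector per action: Q(s, a) = w[a] . s
    w = np.zeros((n_actions, n_features))
    alpha, gamma, epsilon = 0.01, 0.99, 0.1

    for episode in range(500):
        s, _ = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection on the linear Q-values
            if np.random.rand() < epsilon:
                a = env.action_space.sample()
            else:
                a = int(np.argmax(w @ s))
            s_next, r, terminated, truncated, _ = env.step(a)
            done = terminated or truncated
            # Bootstrapped TD target; no bootstrap on true terminal states
            target = r if terminated else r + gamma * np.max(w @ s_next)
            # Semi-gradient update: the gradient of w[a].s w.r.t. w[a] is s
            w[a] += alpha * (target - w[a] @ s) * s
            s = s_next

The same Q-learning update works whether the approximator is linear, a tree ensemble (as in fitted Q-iteration), or a deep network (as in DQN); only the fitting step changes.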

However, almost all state-of-the-art results today use neural networks, largely for two reasons, one theoretical and one empirical. The theoretical reason is the universal approximation theorem, which roughly states that a sufficiently large neural network can approximate any continuous function (on a compact domain) to arbitrary accuracy. The empirical reason is that, on complex problems, neural networks tend to outperform all other methods.

So, to address your question more directly: yes, you can use the other methods mentioned above, but it is probably a bad idea. You are likely better off with a larger neural network, a network architecture better suited to your problem, or fixing some other issue in your approach that you may have missed. It is very unlikely that the problem you face is a fundamental limitation of neural networks.
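To make "a larger neural network" concrete: assuming, for illustration, the Stable-Baselines3 library, the hidden-layer sizes of the actor and critic can be set through the policy_kwargs argument. The environment name and layer sizes below are placeholder assumptions, a sketch rather than a tuned configuration:

    # Sketch: widening the default MLP in Stable-Baselines3 via policy_kwargs.
    # "Pendulum-v1" and the layer sizes are placeholder assumptions.
    from stable_baselines3 import PPO

    model = PPO(
        "MlpPolicy",
        "Pendulum-v1",  # placeholder continuous-action environment
        policy_kwargs=dict(
            # Separate hidden layers for the policy (pi) and value function
            # (vf); the SB3 default is two layers of 64 units each.
            net_arch=dict(pi=[256, 256], vf=[256, 256]),
        ),
        verbose=1,
    )
    model.learn(total_timesteps=100_000)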

chessprogrammer
  • Thanks for your answer. You wrote that "you are likely better off with a larger neural network [or] a network architecture better suited to your problem". How can I change this in Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), and Advantage Actor-Critic (A2C)? I currently use Stable-Baselines3 (https://stable-baselines3.readthedocs.io/en/master/), and there you normally don't specify the size or type of the network. Is it generally possible to use e.g. DQN, PPO, and A2C with another type of neural network, such as an LSTM or a convolutional NN? – PeterBe Apr 27 '22 at 07:41
  • Thanks chessprogrammer for your answer. Any comments on my last comment? I would highly appreciate any further comment from you. – PeterBe Apr 29 '22 at 07:41
  • Any further comments? – PeterBe May 02 '22 at 08:46
  • @PeterBe First off, if you appreciated my answer, please upvote it and mark it as correct. Second, yes, it is generally possible, and even common, to use all of those algorithms with LSTMs or CNNs (see the sketch after this thread). If the library you use does not allow that, simply use another one. I recommend this: https://www.tensorflow.org/agents – chessprogrammer May 02 '22 at 15:29
  • 1
    Thanks for your answer. I upvoted and accepted it. – PeterBe May 03 '22 at 08:51
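Following up on the thread above: Stable-Baselines3 does in fact accept custom architectures through policy_kwargs (see the sketch in the answer), selects a built-in CNN with "CnnPolicy" for image observations, and provides recurrent (LSTM) policies in the companion sb3-contrib package. A minimal sketch, assuming sb3-contrib's RecurrentPPO and a placeholder environment:

    # Sketch: PPO with an LSTM policy via sb3-contrib (not core SB3).
    # The environment is a placeholder assumption.
    from sb3_contrib import RecurrentPPO

    model = RecurrentPPO("MlpLstmPolicy", "Pendulum-v1", verbose=1)
    model.learn(total_timesteps=50_000)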