I have a large control problem with a 13-dimensional continuous input (state) and a 3-dimensional continuous output (action). I tried several reinforcement learning algorithms, including Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), and Advantage Actor-Critic (A2C). Unfortunately, they all yield poor results. As far as I understand, all of them rely on neural networks as function approximators. I therefore suspect that the neural network itself could be the problem, because it might not be able to learn the mapping between inputs and outputs (I have experienced this in several other applications).
So, are there state-of-the-art reinforcement learning algorithms for large problems with multidimensional continuous state and action spaces?
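For concreteness, here is a minimal sketch of the interface a policy has to learn in my setting: 13 continuous values in, 3 continuous values out. Only the two dimensions come from my actual problem. The name `random_policy`, the action bounds of [-1, 1], and the all-zero example state are placeholders chosen purely for illustration.

```python
import random

STATE_DIM = 13   # number of continuous inputs (from the problem description)
ACTION_DIM = 3   # number of continuous outputs (from the problem description)


def random_policy(state, low=-1.0, high=1.0):
    """Hypothetical placeholder policy.

    Ignores the state and samples each action component uniformly from
    [low, high]. The bounds are an assumption, not the real actuator limits.
    """
    if len(state) != STATE_DIM:
        raise ValueError(f"expected {STATE_DIM} state values, got {len(state)}")
    return [random.uniform(low, high) for _ in range(ACTION_DIM)]


# Example call with a made-up all-zero state vector.
state = [0.0] * STATE_DIM
action = random_policy(state)
```

Any algorithm suggested would need to replace `random_policy` with a learned mapping of the same shape.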