
I'm trying to teach a humanoid agent how to stand up after falling. The episode starts with the agent lying on the floor with its back touching the ground, and its goal is to stand up in the shortest amount of time.

But I'm having trouble with reward shaping. I've tried multiple different reward functions, but they all end up the same way: the agent quickly learns to sit (i.e. lifting its torso), but then gets stuck in this local optimum forever.

Any ideas or advice on how to best design a good reward function for this scenario?

A few reward functions I've tried so far:

  • current_height / goal_height
  • current_height / goal_height - 1
  • current_height / goal_height - reward_prev_timestep
  • (current_height / goal_height)^N (tried multiple different values of N)
  • ...
  • In section 6.2.2 of [this](https://arxiv.org/pdf/1506.02438.pdf) paper they provide some reward functions so you can try with these. – Brale May 20 '19 at 17:14

1 Answer


See https://gymnasium.farama.org/environments/mujoco/humanoid_standup/#rewards for a description of how to get a humanoid ragdoll to stand up!
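Roughly, the reward described in those docs combines an upward-displacement term with control and impact penalties. The sketch below paraphrases that structure for illustration (the exact coefficients and variable names are taken from the docs as I recall them; check the link before relying on them):

```python
import numpy as np

def humanoid_standup_reward(torso_z, dt, ctrl, ext_forces):
    """Sketch of the Gymnasium HumanoidStandup-style reward.

    torso_z:    torso z-coordinate after the step
    dt:         simulation timestep
    ctrl:       the action (joint torques) taken this step
    ext_forces: external contact forces on the bodies
    """
    # Reward torso height per unit time: standing faster earns more.
    uph_cost = torso_z / dt
    # Penalize large actions...
    quad_ctrl_cost = 0.1 * np.sum(np.square(ctrl))
    # ...and hard impacts, capped so a single collision can't dominate.
    quad_impact_cost = min(0.5e-6 * np.sum(np.square(ext_forces)), 10.0)
    return uph_cost - quad_ctrl_cost - quad_impact_cost + 1.0
```

Because the dominant term rewards height per timestep rather than raw height, merely sitting yields much less return than continuing upward, which is exactly the failure mode described in the question.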

  • For what it's worth, this is as far as I've got myself so far: https://www.youtube.com/watch?v=ISZ_xgybazQ (it reliably rolls over onto its front) – ChrisSim Apr 16 '23 at 11:09