I'm trying to teach a humanoid agent how to stand up after falling. The episode starts with the agent lying on its back on the floor, and its goal is to stand up in the shortest amount of time.
But I'm having trouble with reward shaping. I've tried several different reward functions, but they all end the same way: the agent quickly learns to sit up (i.e., lift its torso), then gets stuck in that local optimum indefinitely.
Any ideas or advice on how to design a good reward function for this scenario?
A few reward functions I've tried so far (a rough sketch of what I mean is below the list):
- current_height / goal_height
- current_height / goal_height - 1
- current_height / goal_height - reward_prev_timestep
- (current_height / goal_height)^N (tried several values of N)
- ...
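
To make the first few concrete, here's a minimal sketch of the per-step reward computations I mean. Variable names like `torso_height` and `goal_height` are illustrative placeholders, not tied to any particular simulator API:

```python
# Minimal sketch of the height-based rewards listed above
# (illustrative names only, not an actual environment implementation).

def height_ratio_reward(torso_height: float, goal_height: float) -> float:
    """Fraction of the target standing height reached, roughly in [0, 1]."""
    return torso_height / goal_height

def shifted_height_reward(torso_height: float, goal_height: float) -> float:
    """Same ratio shifted down by 1, so the reward stays negative
    until the torso reaches the goal height."""
    return torso_height / goal_height - 1.0

def progress_reward(torso_height: float, goal_height: float,
                    prev_reward: float) -> float:
    """Difference-style variant: rewards only the improvement over the
    previous timestep's value."""
    current = torso_height / goal_height
    return current - prev_reward
```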