
I am working on a project to implement a collision avoidance algorithm on a real unmanned aerial vehicle (UAV).

I'm interested in understanding how to set up a negative reward for scenarios in which the UAV crashes. This is very easy in simulation: if the UAV touches any object, the episode terminates and a negative reward is given. In the real world, a UAV crash usually means hitting a wall or an obstacle, which is difficult to model.
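The simulation case described above can be sketched as a gym-style environment whose `step` terminates with a fixed penalty on contact. All names and values here (the 1-D dynamics, the obstacle location, the `-100` penalty) are illustrative assumptions, not from any specific simulator.

```python
# Minimal sketch of crash handling in simulation: the episode ends
# immediately with a negative reward when the UAV touches an obstacle.
class UAVSimEnv:
    CRASH_REWARD = -100.0  # assumed penalty magnitude; tune per task

    def __init__(self):
        self.position = 0.0
        self.obstacle = 5.0  # hypothetical obstacle location (1-D toy dynamics)

    def reset(self):
        self.position = 0.0
        return self.position

    def step(self, action):
        self.position += action
        if self.position >= self.obstacle:
            # Collision: terminate the episode and return the crash penalty.
            return self.position, self.CRASH_REWARD, True, {"crash": True}
        # Otherwise, a small positive reward for surviving the step.
        return self.position, 1.0, False, {"crash": False}
```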

My initial plan is to stop the RL episode and manually feed a negative reward to the algorithm each time a crash occurs. Any improvements to this plan would be highly appreciated!
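For concreteness, the plan above could look like the training loop below: an operator (or a crash sensor) flags a crash, the loop overrides the reward for the final transition and terminates the episode manually. Every name here (`crash_detected`, `agent.observe`, the penalty value) is a hypothetical placeholder, not the API of any particular RL library.

```python
# Hedged sketch of manually injecting a crash penalty during real-world training.
CRASH_PENALTY = -100.0  # assumed value; tune relative to the task's other rewards

def run_episode(env, agent, crash_detected):
    """Run one episode; crash_detected() returns True once a crash is flagged."""
    obs = env.reset()
    done = False
    while not done:
        action = agent.act(obs)
        next_obs, reward, done, info = env.step(action)
        if crash_detected():
            # Override the reward and cut the episode short on a real crash.
            reward, done = CRASH_PENALTY, True
        agent.observe(obs, action, reward, next_obs, done)
        obs = next_obs
```

One practical note on this design: the override happens before `agent.observe`, so the learner sees the crash transition with the penalty attached, which is what makes the terminal state informative.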

nbro
desert_ranger
  • This is a bad idea, primarily because RL takes many, many episodes, which is not practical on a real device. You're better off planning with classical MPC for trajectory generation. – FourierFlux May 21 '20 at 23:52

0 Answers