I'm working on a simulation of a motor attached to a wing (later, this will also have a real-life counterpart, once I assemble all the components in our lab), and I can control the forces/torques that the motor applies. I want to use RL to find the optimal action in terms of

"what force should the motor apply to maximize lift".

To make things clearer, check out the following figure:

[Figure: the motor mounted on the wing, with the angle $\phi$ marked.]

So, for example, I can find $\phi$ at every time step $t$ (using some first-year physics equations and Python integrators), and this will be my state $s$.
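For concreteness, here is a minimal sketch of what that integration step could look like, assuming the wing acts as a rigid body with moment of inertia $I$ and viscous damping $C$ under a motor torque $\tau$ (all parameter values below are placeholders):

```python
from scipy.integrate import solve_ivp

# Placeholder physical parameters -- replace with the real wing's values.
I = 0.05   # moment of inertia about the hinge (kg*m^2)
C = 0.01   # viscous damping coefficient (N*m*s/rad)

def dynamics(t, y, tau):
    """Rigid-body rotation: I * phi'' = tau - C * phi'."""
    phi, phi_dot = y
    return [phi_dot, (tau - C * phi_dot) / I]

def step(phi, phi_dot, tau, dt=0.01):
    """Advance (phi, phi_dot) by one control interval dt under torque tau."""
    sol = solve_ivp(dynamics, (0.0, dt), [phi, phi_dot], args=(tau,))
    return sol.y[0, -1], sol.y[1, -1]

# One simulated step: apply a constant torque of 0.2 N*m for 10 ms.
phi, phi_dot = step(phi=0.0, phi_dot=0.0, tau=0.2)
```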

Are you familiar with approaches that deal with this problem?

Hadar Sharvit
  • Hey, check out this [answer](https://ai.stackexchange.com/questions/18425/how-can-i-implement-the-reward-function-for-an-8-dof-robot-arm-with-trpo/18436#18436). Also, take a look at these: [one](https://ai.stackexchange.com/questions/28291/how-would-you-shape-a-reward-function-if-there-was-four-quantities-to-optimize/28298#28298), [two](https://ai.stackexchange.com/questions/24157/how-should-i-define-the-reward-function-to-solve-the-wumpus-game-with-deep-q-lea/24164#24164). Did they answer your question? – Aray Karjauv Sep 06 '22 at 13:09

1 Answer


The core component of RL is the reward function. It belongs to the environment, and it is the only signal the agent gets for judging how good its actions are in a given state.

By constructing an appropriate reward function, we can make an agent do exactly what we want.
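For the lift problem in the question, a minimal sketch of such a reward could pay the agent for the lift it generates while charging a small energy cost (the function name and the `torque_penalty` weight are assumptions you would tune):

```python
def lift_reward(lift_force, torque, torque_penalty=1e-3):
    """Hypothetical per-step reward: reward the lift the wing generates,
    minus a small energy cost so the agent doesn't saturate the motor."""
    return lift_force - torque_penalty * torque ** 2
```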

Here is an example of a double-jointed arm that learns to move its hand to target locations in a Unity environment.

A reward of +0.1 is provided for each time step that the agent's hand is in the goal location, so the agent's goal is to keep its hand at the target location for as many time steps as possible.
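In code, that reward rule amounts to something like the following (the `goal_radius` here is an assumption; the actual environment defines its own goal region):

```python
import numpy as np

def reacher_reward(hand_pos, goal_pos, goal_radius=0.05):
    """+0.1 whenever the hand is inside the goal sphere, otherwise 0."""
    return 0.1 if np.linalg.norm(hand_pos - goal_pos) <= goal_radius else 0.0
```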

Initially, the agent performs random actions (random forces on the joints) and eventually learns to follow the target. The high-level idea is described in this video, and there are also a related DeepMind article and paper.

Aray Karjauv