
I want to build a model-based RL agent, and I am wondering about the process of building the model.

If I already have data from real experience:

  • $S_1, a \rightarrow R,S_2$
  • $S_2, a \rightarrow R,S_3$

Can I use this information to build a model-based RL agent? Or is it necessary for the agent to interact with the environment directly (I mean, must the data shown above be collected by the agent itself)?

Neil Slater
user46045

1 Answer


If you already have some transition tuples, then you can train a model to predict the environment dynamics from them. However, you should check that your pre-gathered data is diverse enough to cover the parts of the state/action space the agent will actually visit, so that your model remains accurate there. For instance, as your agent trains it will likely start to see more of the state space than it did at the start of training (imagine playing Atari: initially your agent will die quickly, but as it gets better, episodes will get longer). You would need to make sure you have data for the states that appear late in episodes. Otherwise your model will effectively be fitted only to the start of the episode and will perform poorly on these other states, slowing down or even preventing the learning of an optimal policy.
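To make this concrete, here is a minimal sketch of fitting a dynamics model from pre-gathered tuples and checking how much of the state/action space they cover. It assumes a small, discrete state and action space, so a simple count-based table is enough; for something like Atari you would swap in a learned function approximator instead. The class `TabularModel` and its methods `update`, `predict`, and `coverage` are hypothetical names made up for this illustration, not part of any library.

```python
from collections import defaultdict


class TabularModel:
    """Count-based estimate of P(s' | s, a) and E[r | s, a] (illustrative only)."""

    def __init__(self):
        # (s, a) -> {s_next: count}
        self.next_counts = defaultdict(lambda: defaultdict(int))
        # (s, a) -> sum of observed rewards
        self.reward_sum = defaultdict(float)
        # (s, a) -> number of observed transitions
        self.visits = defaultdict(int)

    def update(self, transitions):
        """Add (s, a, r, s_next) tuples; works for historical or freshly collected data."""
        for s, a, r, s_next in transitions:
            self.next_counts[(s, a)][s_next] += 1
            self.reward_sum[(s, a)] += r
            self.visits[(s, a)] += 1

    def predict(self, s, a):
        """Return (next-state probabilities, mean reward), or None if (s, a) was never seen."""
        n = self.visits.get((s, a), 0)
        if n == 0:
            return None
        probs = {s2: c / n for s2, c in self.next_counts[(s, a)].items()}
        return probs, self.reward_sum[(s, a)] / n

    def coverage(self, states, actions):
        """Fraction of (state, action) pairs with at least one observed transition."""
        seen = sum(
            1 for s in states for a in actions if self.visits.get((s, a), 0) > 0
        )
        return seen / (len(states) * len(actions))


# Made-up historical data in the (S, a, R, S') form from the question.
historical = [
    ("S1", "a", 1.0, "S2"),
    ("S2", "a", 0.0, "S3"),
    ("S1", "a", 0.0, "S2"),
]

model = TabularModel()
model.update(historical)

probs, mean_r = model.predict("S1", "a")
coverage_before = model.coverage(["S1", "S2", "S3"], ["a"])

# Later, the agent reaches a state the historical data never covered.
model.update([("S3", "a", 2.0, "S1")])
coverage_after = model.coverage(["S1", "S2", "S3"], ["a"])
```

A low `coverage` value, or `predict` returning `None` for pairs the agent actually reaches, is a sign that the historical data alone is not enough and the model should be updated with transitions the agent collects itself.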

David
  • Thank you, the issue is clear to me. So can I use this historical data as input to model-based RL, or do I need some additional modeling techniques to estimate the transitions? – user46045 Apr 07 '21 at 09:50
  • You could try training your model on your historical data; at the least, it would give it a good 'initialisation'. You could then update your model once you have some new data that the agent collects. – David Apr 07 '21 at 09:53
  • Could you recommend a good library or tutorial for coding model-free RL? – user46045 Apr 07 '21 at 09:57
  • If you want to learn model-free RL, then I would look for some Medium articles on DQN; they are usually a good resource with well-explained code. – David Apr 07 '21 at 09:59
  • Sorry, I meant model-based. That was my mistake. – user46045 Apr 07 '21 at 10:00
  • I'm afraid not, I have never worked with model-based methods before. You could look at some papers and see if the authors have a public GitHub repo. – David Apr 07 '21 at 10:03
  • cs285 has a comprehensive lecture on model-based RL @user46045. [lecture video](https://youtu.be/6JDfrPRhexQ?list=PLkFD6_40KJIwhWJpGazJ9VSj9CFMkb79A). [MBRL notes](http://rail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-11.pdf), [Practice code](http://rail.eecs.berkeley.edu/deeprlcourse/static/homeworks/hw4.pdf). I've added these a bit late, but they might help someone else later. – mugoh Jun 21 '21 at 04:55