
I'm reading Sutton & Barto's book "Reinforcement Learning: An Introduction" (2nd Edition), as the classes I took were a long time ago, and I'm struggling to understand this part (p. 12):

To do this, it had to have a model of the game that allowed it to foresee how its environment would change in response to moves that it might never make. Many problems are like this, but in others even a short-term model of the effects of actions is lacking. Reinforcement learning can be applied in either case.

I don't understand how games can fail to have a model. If there is randomness in the environment's reaction to the agent's action, we still have a model, don't we? And if a game does not have a model with states and values, how can RL be applied to it?

So my question would be:

Do you know of examples where "even a short-term model of the effects of actions is lacking", i.e. where an MDP cannot be drawn or estimated?

I hope this is not too broad a question. I've mainly done RL on Markov decision processes, and I'm struggling to see how a game could fail to be modelled by one (even a very big one!).
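For what it's worth, here is how I currently picture the model-free side of the distinction: a method like tabular Q-learning never queries transition probabilities; it only samples (state, action, reward, next-state) tuples from a black-box `step()` call. The toy environment below is something I made up for illustration (it is not from the book), but it shows that the learner works even though the dynamics are hidden from it:

```python
import random

# A hypothetical black-box environment: the agent can only call step()
# and observe the outcome; it never sees the transition probabilities.
class BlackBoxEnv:
    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Hidden dynamics: action 1 moves right with probability 0.8,
        # anything else (or bad luck) moves left. States are 0..3.
        if action == 1 and random.random() < 0.8:
            self.state = min(self.state + 1, 3)
        else:
            self.state = max(self.state - 1, 0)
        reward = 1.0 if self.state == 3 else 0.0
        done = self.state == 3
        return self.state, reward, done

# Tabular Q-learning: the update uses only sampled (s, a, r, s') tuples,
# so no model of the environment's dynamics is ever required.
def q_learning(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1):
    Q = [[0.0, 0.0] for _ in range(4)]  # Q[state][action]
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection.
            if random.random() < epsilon:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q[s][x])
            s2, r, done = env.step(a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

random.seed(0)
Q = q_learning(BlackBoxEnv())
# The greedy policy should prefer action 1 (move right) in states 0..2.
print([max((0, 1), key=lambda a: Q[s][a]) for s in range(3)])
```

So my confusion is really about the book's stronger claim: even in a case like this, the *environment itself* clearly has underlying dynamics, whether or not the agent knows them.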

  • Can you please put your **specific question** in the title? "Reinforcement learning no model of the game" is not a question and it's also not specific. Thanks. – nbro Feb 28 '22 at 22:52
  • I've tried to clarify the title and the post. Sorry. – FluidMechanics Potential Flows Mar 01 '22 at 12:10
  • But do you understand what "model" refers to in RL? If yes, then your question is not a duplicate, as I thought. Edit your post to clarify this. – nbro Mar 01 '22 at 13:36
  • Well I think I do but I'm not 100% sure otherwise I think I'd understand the statement I've quoted. I've tried editing my post, hopefully it'll be more clear. – FluidMechanics Potential Flows Mar 01 '22 at 14:19
  • So, it seems to me that you're asking: "Is there a game where there is no way to estimate a model of the game?". Is this correct? By the way, the quote "To do this, it had to have a model of the game that allowed it to foresee how its environment would change in response to moves that it might never make.... " was taken from where? – nbro Mar 01 '22 at 15:42
  • Yes, not necessarily a concrete example but I would like to understand how that would be even possible. To me everything can be summarized by states, actions and rewards. It was taken from Reinforcement Learning An Introduction, 2nd Edition – FluidMechanics Potential Flows Mar 01 '22 at 15:43
  • Any idea? @nbro Sorry to ping you but I'm struggling to find an example or a way to understand how that would be possible. – FluidMechanics Potential Flows Mar 08 '22 at 07:55
  • 1
    So, are you asking if there are cases where we wouldn't be able to estimate a model of the environment? And why wouldn't we able able to do so? – nbro Mar 08 '22 at 10:54
  • 1
    I voted to reopen this post, as I don't think it's a duplicate anymore. Anyway, I think that the answer to your question is: they are not saying that it's not possible to estimate the model, they are saying that the model is lacking. So, I cannot think of a problem where it would be impossible to estimate a model. There may be cases where it may be difficult to estimate a good model because maybe the state and action spaces are very big and/or it takes a long time to explore the environment. If the post gets reopened, I may convert this comment into a formal answer. – nbro Mar 08 '22 at 11:04
  • Oh ok so basically they're talking about problems where the state and action spaces are massive and not where there is no state and action space. That's what I was confused about. – FluidMechanics Potential Flows Mar 08 '22 at 16:43

0 Answers