Questions tagged [model-based-methods]

For questions about model-based reinforcement learning methods (or algorithms). An example of a model-based algorithm is Dyna-Q, which estimates a model of the environment (i.e. the transition function of the associated Markov decision process).
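
As a rough illustration of what "estimating a model" can mean in the tabular case, the sketch below builds a count-based estimate of $p(s' \mid s, a)$ from logged transitions. It is not Dyna-Q itself, only a possible model-learning step; the function name `estimate_transition_model` and the toy `experience` list are hypothetical and chosen for this example.

```python
from collections import Counter, defaultdict


def estimate_transition_model(transitions):
    """Estimate p(s' | s, a) from (s, a, r, s') tuples by counting.

    Assumes a small, discrete (tabular) state and action space.
    """
    counts = defaultdict(Counter)
    for s, a, _r, s_next in transitions:
        counts[(s, a)][s_next] += 1

    model = {}
    for state_action, next_counts in counts.items():
        total = sum(next_counts.values())
        # Relative frequency of each observed next state.
        model[state_action] = {
            s_next: n / total for s_next, n in next_counts.items()
        }
    return model


# Hypothetical logged experience: (state, action, reward, next_state).
experience = [
    ("S1", "a", 0.0, "S2"),
    ("S1", "a", 0.0, "S2"),
    ("S1", "a", 1.0, "S3"),
    ("S2", "a", 1.0, "S3"),
]

model = estimate_transition_model(experience)
print(model[("S1", "a")])
```

A planner or a Dyna-style agent could then sample simulated transitions from `model` instead of querying the real environment; rewards could be estimated analogously by averaging the observed `r` for each state-action pair.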

40 questions
86
votes
6 answers

What's the difference between model-free and model-based reinforcement learning?

What's the difference between model-free and model-based reinforcement learning? It seems to me that any model-free learner, learning through trial and error, could be reframed as model-based. In that case, when would model-free learners be…
6
votes
2 answers

Are there RL algorithms that also try to predict the next state?

So far I've developed simple RL algorithms, like Deep Q-Learning and Double Deep Q-Learning. Also, I read a bit about A3C and policy gradient but superficially. If I remember correctly, all these algorithms focus on the value of the action and try…
5
votes
3 answers

Isn't a simulation a great model for model-based reinforcement learning?

Most reinforcement learning agents are trained in simulated environments. The goal is to maximize performance in (often) the same environment, preferably with a minimum amount of interactions. Having a good model of the environment allows to use…
5
votes
2 answers

How can the policy iteration algorithm be model-free if it uses the transition probabilities?

I'm trying to understand policy iteration in the context of RL. I read an article presenting it and, at some point, pseudocode of the algorithm is given: What I can't understand is this line: From what I understand, policy…
4
votes
1 answer

How does a model-based agent learn the model?

I want to build model-based RL. I am wondering about the process of building the model. If I already have data from real experience: $S_1, a \rightarrow R,S_2$ $S_2, a \rightarrow R,S_3$ Can I use this information to build model-based RL? Or it…
4
votes
1 answer

What is the difference between a distribution model and a sampling model in Reinforcement Learning?

The book by Sutton and Barto, Reinforcement Learning: An Introduction, defines a model in Reinforcement Learning as something that mimics the behavior of the environment, or more generally, that allows inferences to be made about how the…
4
votes
1 answer

Is the state transition matrix known to the agents in a Markov decision process?

The question is more or less in the title. A Markov decision process consists of a state space, a set of actions, the transition probabilities and the reward function. If I now take an agent's point of view, does this agent "know" the transition…
4
votes
1 answer

Is the minimax algorithm model-based?

Trying to get my head around model-free and model-based algorithms in RL. In my research, I've seen the search trees created via the minimax algorithm. I presume these trees can only be created with a model-based agent that knows the full…
4
votes
1 answer

Why are model-based methods more sample efficient than model-free methods?

Why do model-based methods use fewer samples than model-free methods? Here, I'm specifically referring to model-based methods in which we have to learn a policy and model. I can only think of two reasons for this question: We can potentially obtain…
4
votes
1 answer

How do temporal-difference and Monte Carlo methods work, if they do not have access to model?

In value iteration, we have a model of the environment's dynamics, i.e. $p(s', r \mid s, a)$, which we use to update an estimate of the value function. In the case of temporal-difference and Monte Carlo methods, we do not use $p(s', r \mid s, a)$,…
3
votes
2 answers

Is Q-learning a type of model-based RL?

Model-based RL creates a model of the transition function. Tabular Q-Learning does this iteratively (without directly optimizing for the transition function). So, does this make tabular Q-learning a type of model-based RL?
3
votes
1 answer

Why is learning $s'$ from $s,a$ a kernel density estimation problem but learning $r$ from $s,a$ is just regression?

In David Silver's 8th lecture he talks about model learning and says that learning $r$ from $s,a$ is a regression problem whereas learning $s'$ from $s,a$ is a kernel density estimation. His explanation for the difference is that if we are in a…
2
votes
1 answer

Model-based RL algorithms for continuous state space and finite action space

Suppose that, at the beginning, I have a complete model $p(s' \mid s, a)$ (an assumed true model that describes the environment well enough) and the reward function $r(s,a,s')$. How can I exploit the model and learn a good policy in this situation? Assume that…
2
votes
1 answer

If we can model the environment, wouldn't it be meaningless to use a model-free algorithm?

I am trying to understand the concept of model-free and model-based approaches. As far as I understand, having a model of the environment does not mean that an RL agent has to be model-based. It is about the policy. However, if we can model the…
2
votes
0 answers

What kind of reinforcement learning method does DeepMind's AlphaGo use to beat the best human Go player?

In reinforcement learning, there are model-based versus model-free methods. Within model-based ones, there are policy-based and value-based methods. DeepMind's AlphaGo RL model has beaten the best human Go player. What kind of reinforcement model does…