Highest Voted 'model-free-methods' Questions - Artificial Intelligence Stack Exchange

86

votes

6 answers

What's the difference between model-free and model-based reinforcement learning?

What's the difference between model-free and model-based reinforcement learning? It seems to me that any model-free learner, learning through trial and error, could be reframed as model-based. In that case, when would model-free learners be…

asked Nov 07 '17 at 14:10

mynameisvinn

5

votes

2 answers

How can the policy iteration algorithm be model-free if it uses the transition probabilities?

I'm trying to understand policy iteration in the context of RL. I read an article presenting it and, at some point, pseudo-code for the algorithm is given. What I can't understand is one line of it. From what I understand, policy…

reinforcement-learning comparison model-based-methods model-free-methods policy-iteration

asked Mar 11 '20 at 16:11

Samuel Beaussant

4

votes

1 answer

Why are state-values alone not sufficient in determining a policy (without a model)?

"If a model is not available, then it is particularly useful to estimate action values (the values of state-action pairs) rather than state values. With a model, state values alone are sufficient to determine a policy; one simply looks ahead one…

reinforcement-learning monte-carlo-methods model-free-methods

asked Aug 07 '20 at 03:57

stoic-santiago

4

votes

1 answer

How does policy evaluation work for continuous state space model-free approaches?

How does policy evaluation work for continuous state space model-free approaches? Theoretically, a model-based approach for the discrete state and action space can be computed via dynamic programming and solving the Bellman equation. Let's say you…

reinforcement-learning deep-rl monte-carlo-methods model-free-methods policy-evaluation

asked Feb 19 '20 at 02:26

calveeen

4

votes

1 answer

Is the minimax algorithm model-based?

Trying to get my head around model-free and model-based algorithms in RL. In my research, I've seen the search trees created via the minimax algorithm. I presume these trees can only be created with a model-based agent that knows the full…

reinforcement-learning comparison minimax model-based-methods model-free-methods

asked Feb 18 '20 at 00:00

mason7663

4

votes

1 answer

Why are model-based methods more sample efficient than model-free methods?

Why do model-based methods use fewer samples than model-free methods? Here, I'm specifically referring to model-based methods in which we have to learn a policy and model. I can only think of two reasons for this question: We can potentially obtain…

reinforcement-learning comparison model-based-methods model-free-methods sample-efficiency

asked Aug 28 '19 at 14:27

Maybe

4

votes

1 answer

How do temporal-difference and Monte Carlo methods work, if they do not have access to a model?

In value iteration, we have a model of the environment's dynamics, i.e., $p(s', r \mid s, a)$, which we use to update an estimate of the value function. In the case of temporal-difference and Monte Carlo methods, we do not use $p(s', r \mid s, a)$,…

reinforcement-learning monte-carlo-methods temporal-difference-methods model-based-methods model-free-methods

asked Feb 15 '19 at 04:49

strongguy122

3

votes

1 answer

Are model-free and off-policy algorithms the same?

With respect to RL, are model-free and off-policy the same thing, just with different terminology? If not, what are the differences? I've read that the policy can be thought of as 'the brain', or decision-making part, of a machine learning application, where…

reinforcement-learning comparison terminology off-policy-methods model-free-methods

asked Feb 01 '20 at 17:30

mason7663

2

votes

1 answer

How to prove importance sampling ratio is uncorrelated with action-value (or state-value) estimate?

In Sutton & Barto (2nd edition), the following is mentioned on page 150 (p. 172 of the pdf), section 7.4: the importance sampling ratio has expected value one (Section 5.9) and is uncorrelated with the estimate. How can we prove the importance…

proofs off-policy-methods sutton-barto importance-sampling model-free-methods

asked Aug 20 '21 at 16:52

user529295

2

votes

1 answer

If we can model the environment, wouldn't it be meaningless to use a model-free algorithm?

I am trying to understand the concept of model-free and model-based approaches. As far as I understand, having a model of the environment does not mean that an RL agent has to be model-based. It is about the policy. However, if we can model the…

reinforcement-learning ai-design model-based-methods model-free-methods

asked Aug 04 '21 at 09:37

Ayska

2

votes

0 answers

What kind of reinforcement learning method does DeepMind's AlphaGo use to beat the best human Go player?

In reinforcement learning, there are model-based versus model-free methods. Within model-based ones, there are policy-based and value-based methods. DeepMind's AlphaGo RL model has beaten the best human Go player. What kind of reinforcement model does…

reinforcement-learning policies model-based-methods model-free-methods value-based-methods

asked Dec 23 '20 at 06:14

user781486

2

votes

1 answer

Into which subcategories can reinforcement learning be divided?

In the course of a scientific work, I will discuss the different types of reinforcement learning. However, I have difficulty finding these different types. So, into which subcategories can reinforcement learning be divided? For example, the…

reinforcement-learning monte-carlo-methods temporal-difference-methods model-based-methods model-free-methods

asked Jul 03 '20 at 12:12

jackless

2

votes

1 answer

What is the relation between Monte Carlo and model-free algorithms?

Monte Carlo (MC) methods are methods that use some form of randomness or sampling. For example, we can use an MC method to approximate the area of a circle inside a square: we generate random 2D points inside the square and count the number of…

reinforcement-learning monte-carlo-methods temporal-difference-methods comparison model-free-methods

asked May 13 '19 at 16:12

nbro

1

vote

1 answer

How does one normalize observations in online reinforcement learning?

I was wondering how one would normalize observations to a policy without knowing the upper and lower limits of the environment values. A trivial technique would be to normalize each observation by its maximum value before inputting it into a policy.…

reinforcement-learning deep-rl model-free-methods

asked Jul 09 '23 at 19:13

desert_ranger

1

vote

1 answer

In deep reinforcement learning, what is this model with state as input and value as output?

I was looking at this implementation for creating an agent for playing Tetris using DeepRL. This model uses "a state based on the statistics of the board after a potential action. All predictions would be compared but the action with the best state…

reinforcement-learning q-learning deep-rl model-based-methods model-free-methods

asked Aug 09 '21 at 16:57

JeanMi

Questions tagged [model-free-methods]