Highest Voted 'stationary-policy' Questions - Artificial Intelligence Stack Exchange

15 votes, 4 answers

What does "stationary" mean in the context of reinforcement learning?

I think I've seen the expressions "stationary data", "stationary dynamics" and "stationary policy", among others, in the context of reinforcement learning. What do they mean? I think a stationary policy is one that does not depend on time,…

asked Aug 20 '18 at 10:09 by Paula Vega

8 votes, 1 answer

What is the difference between a stationary and a non-stationary policy?

In reinforcement learning, there are deterministic and non-deterministic (or stochastic) policies, but there are also stationary and non-stationary policies. What is the difference between a stationary and a non-stationary policy? How do you…

reinforcement-learning comparison policies stationary-policy

asked Jun 27 '19 at 15:14 by nbro
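As a rough illustration of the distinction this question asks about, the sketch below contrasts a policy whose action distribution depends only on the state with one that also depends on the time step. The two-action setup, the function names, and the `horizon` parameter are all hypothetical, chosen only for this example.

```python
import random

def stationary_policy(state):
    """Action distribution depends only on the state (hypothetical numbers)."""
    if state == 0:
        return {"left": 0.5, "right": 0.5}
    return {"left": 0.1, "right": 0.9}

def non_stationary_policy(state, t, horizon=10):
    """Action distribution depends on the state AND the time step t.

    Assumption for illustration: the probability of "right" grows
    linearly with t until it reaches 1 at t == horizon.
    """
    p_right = min(1.0, t / horizon)
    return {"left": 1.0 - p_right, "right": p_right}

def sample(dist, rng):
    """Draw one action from a {action: probability} dictionary."""
    actions, probs = zip(*dist.items())
    return rng.choices(actions, weights=probs, k=1)[0]

rng = random.Random(0)
# Same state, different time steps: only the non-stationary policy changes.
early = non_stationary_policy(0, t=0)
late = non_stationary_policy(0, t=10)
first_action = sample(early, rng)
last_action = sample(late, rng)
```

Calling `stationary_policy(0)` at any time step returns the same distribution, whereas `non_stationary_policy(0, t)` returns a different one as `t` changes.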

2 votes, 0 answers

Should I use the discounted average reward as objective in a finite-horizon problem?

I am new to reinforcement learning. For a finite-horizon application, I am considering using the average reward instead of the sum of rewards as the objective. Specifically, there are at most $T$ possible time steps (e.g.,…

reinforcement-learning q-learning rewards stationary-policy

asked Aug 10 '20 at 06:06 by lll

1 vote, 0 answers

Why do bootstrapping methods produce nonstationary targets more than non-bootstrapping methods?

The following quote is taken from the beginning of the chapter on "Approximate Solution Methods" (p. 198) in "Reinforcement Learning" by Sutton & Barto (2018): reinforcement learning generally requires function approximation methods able to handle…

reinforcement-learning monte-carlo-methods temporal-difference-methods stationary-policy bootstrapping

asked Jun 27 '20 at 13:00 by Johan
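One way to see what the question title means by nonstationary targets is the toy sketch below. It assumes a hypothetical deterministic two-state chain (state 0, then state 1, then termination, with reward 1 on each step), a discount of 1, and a step size of 0.5. None of these choices come from the quoted text. The Monte Carlo target for state 0 is the observed return, which stays fixed here, while the TD(0) target depends on the current estimate of state 1 and so moves as that estimate is updated.

```python
GAMMA = 1.0   # discount factor (assumed)
ALPHA = 0.5   # step size (assumed)
REWARDS = [1.0, 1.0]  # reward on leaving state 0, then state 1; episode ends

def mc_target(state):
    """Monte Carlo target: the discounted return observed from `state`."""
    return sum(GAMMA**k * r for k, r in enumerate(REWARDS[state:]))

def td_target(state, V):
    """TD(0) target: one reward plus the current estimate of the next state."""
    next_value = V[state + 1] if state + 1 < len(REWARDS) else 0.0
    return REWARDS[state] + GAMMA * next_value

V = [0.0, 0.0]  # value estimates for states 0 and 1
td_targets_for_s0, mc_targets_for_s0 = [], []
for _ in range(3):
    # Record both targets for state 0 before this sweep's updates.
    td_targets_for_s0.append(td_target(0, V))
    mc_targets_for_s0.append(mc_target(0))
    # One sweep of TD(0) updates over the episode's states.
    for s in (0, 1):
        V[s] += ALPHA * (td_target(s, V) - V[s])
```

After the loop, `mc_targets_for_s0` holds the same value three times, while `td_targets_for_s0` changes from sweep to sweep even though the environment and policy are fixed.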

1 vote, 1 answer

What is the difference between the definition of a stationary policy in reinforcement learning and contextual bandit?

A stationary policy is a function that maps a state to a probability distribution over actions. In a contextual bandit problem, the state itself does not include the history, but in a reinforcement learning problem, the history can be used to define a…

machine-learning reinforcement-learning comparison definitions stationary-policy

asked Oct 03 '19 at 18:50 by Hunnam
