For questions related to the concept of a stationary policy (in reinforcement learning and other AI sub-fields).
Questions tagged [stationary-policy]
5 questions
15 votes, 4 answers
What does "stationary" mean in the context of reinforcement learning?
I think I've seen the expressions "stationary data", "stationary dynamics" and "stationary policy", among others, in the context of reinforcement learning. What do they mean? I think stationary policy means that the policy does not depend on time,…

Paula Vega
8 votes, 1 answer
What is the difference between a stationary and a non-stationary policy?
In reinforcement learning, there are deterministic and non-deterministic (or stochastic) policies, but there are also stationary and non-stationary policies.
What is the difference between a stationary and a non-stationary policy? How do you…

nbro
2 votes, 0 answers
Should I use the discounted average reward as the objective in a finite-horizon problem?
I am new to reinforcement learning, but, for a finite-horizon application, I am considering using the average reward instead of the sum of rewards as the objective. Specifically, there are at most $T$ time steps in total (e.g.,…

lll
1 vote, 0 answers
Why do bootstrapping methods produce nonstationary targets more than non-bootstrapping methods?
The following quote is taken from the beginning of the chapter on "Approximate Solution Methods" (p. 198) in "Reinforcement Learning" by Sutton & Barto (2018):
reinforcement learning generally requires function approximation methods able to handle…

Johan
1 vote, 1 answer
What is the difference between the definition of a stationary policy in reinforcement learning and in contextual bandits?
A stationary policy is a function that maps a state to a probability distribution over actions.
In a contextual bandit problem, a state itself does not include the history. But in a reinforcement learning problem, the history can be used to define a…

Hunnam
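
To make the definition used across these questions concrete (a stationary policy maps a state to a probability distribution over actions and does not depend on time), here is a minimal Python sketch. The state names, action names, probabilities, the `horizon` parameter, and all function names are hypothetical and chosen only for illustration; none of them come from the questions listed above.

```python
import random

# Hypothetical two-state, two-action problem, used only for illustration.
# A stationary stochastic policy: each state maps to a fixed probability
# distribution over actions, independent of the time step.
STATIONARY_POLICY = {
    "s0": {"left": 0.9, "right": 0.1},
    "s1": {"left": 0.2, "right": 0.8},
}


def stationary_probs(state, t):
    """Action distribution in `state`; the time step `t` is ignored."""
    return STATIONARY_POLICY[state]


def non_stationary_probs(state, t, horizon=10):
    """Action distribution that also depends on the time step `t`.

    Illustrative rule (an assumption, not a standard algorithm): follow
    the stationary table early on, then always pick "right" during the
    last two steps before `horizon`.
    """
    if t >= horizon - 2:
        return {"left": 0.0, "right": 1.0}
    return STATIONARY_POLICY[state]


def sample_action(probs, rng):
    """Draw one action from an {action: probability} mapping."""
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

With these definitions, `stationary_probs("s0", 0)` and `stationary_probs("s0", 9)` return the same distribution, whereas `non_stationary_probs("s0", 0)` and `non_stationary_probs("s0", 9)` differ, which is the distinction the tag refers to.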