Questions tagged [expected-sarsa]

For questions related to the reinforcement learning algorithm called "expected SARSA" (as described in the book "Reinforcement Learning: An Introduction", by Sutton and Barto, 2nd edition).
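For readers new to the tag, here is a minimal sketch of the tabular Expected SARSA update, assuming an epsilon-greedy target policy and a NumPy array `Q` of shape (states, actions). The helper names `epsilon_greedy_probs` and `expected_sarsa_update` are illustrative only and do not come from the book.

```python
import numpy as np

def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy for one state (assumed policy)."""
    n = len(q_row)
    probs = np.full(n, epsilon / n)                 # exploration mass spread uniformly
    probs[int(np.argmax(q_row))] += 1.0 - epsilon   # remaining mass on the greedy action
    return probs

def expected_sarsa_update(Q, s, a, r, s_next, alpha, gamma, epsilon, done=False):
    """Apply one Expected SARSA update to Q in place and return the TD target."""
    if done:
        target = r  # no bootstrapping from a terminal state
    else:
        probs = epsilon_greedy_probs(Q[s_next], epsilon)
        # Expectation over next actions replaces SARSA's single sampled next action.
        target = r + gamma * float(np.dot(probs, Q[s_next]))
    Q[s, a] += alpha * (target - Q[s, a])
    return target
```

With `epsilon=0` the target policy is greedy and the expectation reduces to a max over next actions, which is the Q-learning target.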

4 questions
6 votes, 1 answer

Is Expected SARSA an off-policy or on-policy algorithm?

I understand that SARSA is an on-policy algorithm and Q-learning is an off-policy one. Sutton and Barto's textbook describes Expected Sarsa thusly: In these cliff walking results Expected Sarsa was used on-policy, but in general it might use a…
5 votes, 1 answer

Expected SARSA vs SARSA in "RL: An Introduction"

In the 2018 edition of "Reinforcement Learning: An Introduction", Sutton and Barto write the following in the context of Expected SARSA (p. 133): Expected SARSA is more complex computationally than Sarsa but, in return, it eliminates the…
2 votes, 1 answer

Why would SARSA diverge (but not Expected SARSA or Q-learning)?

In figure 6.3 (shown below) from Reinforcement Learning: An Introduction (second edition) by Sutton and Barto, SARSA is shown to perform worse asymptotically (after 100k episodes) than in the interim (after 100 episodes) for larger values of alpha…
0 votes, 1 answer

What does the figure in Q-learning vs Expected SARSA actually show?

I might be blind, but I wasn't able to find or figure out what the small difference between Q-learning and SARSA depicts in the following image (src). What does the semi-circle show, and what does the lack of the semi-circle show? I've your eyes with…
nammerkage