For questions related to the reinforcement learning algorithm called "expected SARSA" (as described in the book "Reinforcement Learning: An Introduction", by Sutton and Barto, 2nd edition).
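As a rough orientation for this tag, the tabular update can be sketched as below. This is a minimal sketch, not code from the book: the function names `epsilon_greedy_probs` and `expected_sarsa_update`, the array layout `Q[state, action]`, and the choice of an epsilon-greedy target policy are all assumptions made for illustration.

```python
import numpy as np

def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy for one state (assumed target policy)."""
    n = len(q_values)
    probs = np.full(n, epsilon / n)                    # exploration mass spread evenly
    probs[int(np.argmax(q_values))] += 1.0 - epsilon   # remaining mass on the greedy action
    return probs

def expected_sarsa_update(Q, s, a, r, s_next, alpha, gamma, epsilon, done=False):
    """Move Q[s, a] toward r + gamma * sum_a' pi(a'|s_next) * Q[s_next, a'] (in place)."""
    if done:
        expected_q = 0.0                               # no bootstrap from a terminal state
    else:
        probs = epsilon_greedy_probs(Q[s_next], epsilon)
        expected_q = float(np.dot(probs, Q[s_next]))   # expectation instead of a sampled action
    Q[s, a] += alpha * (r + gamma * expected_q - Q[s, a])
    return Q
```

With `epsilon=0` the target policy is greedy and the expectation reduces to the maximum over next actions, which is the Q-learning target. SARSA would instead plug in the value of the single next action that was actually sampled.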
Questions tagged [expected-sarsa]
4 questions
6
votes
1 answer
Is Expected SARSA an off-policy or on-policy algorithm?
I understand that SARSA is an on-policy algorithm, and Q-learning an off-policy one.
Sutton and Barto's textbook describes Expected Sarsa thusly:
In these cliff walking results Expected Sarsa was used on-policy, but
in general it might use a…

Y. Xu
- 63
- 1
- 4
5
votes
1 answer
Expected SARSA vs SARSA in "RL: An Introduction"
In the 2018 edition of "Reinforcement Learning: An Introduction", Sutton and Barto state the following about Expected SARSA (p. 133):
Expected SARSA is more complex computationally than Sarsa but, in return, it eliminates the…

F.M.F.
- 311
- 3
- 7
2
votes
1 answer
Why would SARSA diverge (but not Expected SARSA or Q-learning)?
In figure 6.3 (shown below) from Reinforcement Learning: An Introduction (second edition) by Sutton and Barto, SARSA is shown to perform worse asymptotically (after 100k episodes) than in the interim (after 100 episodes) for larger values of alpha…

Quantum Sphinx
- 121
- 3
0
votes
1 answer
What does the figure in Q-learning vs Expected SARSA actually show?
I might be blind.
But I wasn't able to find or figure out what the small difference between Q-learning and Expected SARSA depicts in the following image (src).
What does the semi-circle show, and what does the lack of the semi-circle show? I've your eyes with…

nammerkage
- 206
- 1
- 7