Why is Thompson Sampling considered a part of Reinforcement Learning?

Question

I often see Thompson Sampling in the RL literature, but I am not able to relate it to any of the current RL techniques. How exactly does it fit into RL?

score 3 · Accepted Answer · answered Dec 05 '21 at 18:08

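Thompson Sampling (TS) is mostly used in the context of bandits, which are a special case of the RL problem: a bandit can be viewed as an RL problem with a single state, so the agent still faces the exploration-exploitation trade-off, just without state transitions.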
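To make the bandit case concrete, here is a minimal sketch of TS for a Bernoulli bandit with Beta(1, 1) priors. The function name `thompson_sampling_bernoulli`, the arm means, and the number of rounds are all illustrative assumptions of mine, not something taken from the references in this answer.

```python
import numpy as np


def thompson_sampling_bernoulli(true_means, n_rounds=2000, seed=0):
    """Run Beta-Bernoulli Thompson Sampling on a simulated bandit.

    true_means is a hypothetical list of per-arm success probabilities;
    the agent never sees it directly, only the sampled rewards.
    """
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    # Beta(1, 1) prior (uniform) on each arm's success probability.
    alpha = np.ones(n_arms)
    beta = np.ones(n_arms)
    pulls = np.zeros(n_arms, dtype=int)

    for _ in range(n_rounds):
        # Sample one plausible mean per arm from the current posterior.
        theta = rng.beta(alpha, beta)
        # Act greedily with respect to the sampled means.
        arm = int(np.argmax(theta))
        # Simulated Bernoulli reward from the environment.
        reward = int(rng.random() < true_means[arm])
        # Conjugate posterior update for the pulled arm only.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, alpha, beta


if __name__ == "__main__":
    pulls, alpha, beta = thompson_sampling_bernoulli([0.2, 0.5, 0.8])
    print("pulls per arm:", pulls)
    print("posterior means:", alpha / (alpha + beta))
```

Exploration here comes entirely from the randomness of the posterior samples: arms with few pulls have wide posteriors, so they are occasionally sampled as the best arm and tried again.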

You can also use TS for the full RL problem, but that can lead to inefficient exploration. To learn more about this issue, you could read

Section 7.5, Reinforcement Learning in Markov Decision Processes (p. 62), of A Tutorial on Thompson Sampling (2017) by Russo et al.,
my answer here, and
the paper Deep Exploration via Randomized Value Functions (2019, JMLR) by Ian Osband et al.
