Why is it necessary to divide the priority range according to the batch size in Prioritized Experience Replay?

Asked Sep 29 '20 at 14:11

Active Nov 01 '20 at 19:04

Viewed 59 times

According to DeepMind's paper Prioritized Experience Replay (2016), specifically Appendix B.2.1 "Proportional prioritization" (p. 13), one should divide the priority range $[0, p_\text{total}]$ equally into $k$ sub-ranges, where $k$ is the batch size, and sample one value uniformly from each sub-range. Each sampled value is then used to retrieve an experience from the sum-tree, so that experiences are drawn with probability proportional to their priority.
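
Written out, as I understand the procedure (the segment index $i$ and the sampled value $s_i$ are my own notation, not the paper's):

$$s_i \sim \mathcal{U}\!\left(\frac{(i-1)\,p_\text{total}}{k},\; \frac{i\,p_\text{total}}{k}\right), \qquad i = 1, \dots, k,$$

and for each $s_i$ the sum-tree returns the experience whose cumulative-priority interval contains $s_i$.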

Why do we need to do that? Why not simply sample $k$ values uniformly from $[0, p_\text{total}]$ and retrieve the $k$ corresponding experiences from the sum-tree, without dividing the priority range into $k$ sub-ranges? Isn't this the same?
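
To make the comparison concrete, here is a minimal Python sketch of the two schemes as I understand them. The `SumTree` class and the function names `sample_stratified` and `sample_independent` are hypothetical, written for illustration only; they are not taken from the paper or from any library, and the tree is a simplified array-based version that assumes fixed priorities.

```python
import random


class SumTree:
    """Illustrative sum-tree: leaves hold priorities, internal nodes hold sums."""

    def __init__(self, priorities):
        self.n = len(priorities)
        # Pad the leaf count to a power of two so leaves stay in index order.
        self.size = 1
        while self.size < self.n:
            self.size *= 2
        self.tree = [0.0] * (2 * self.size)
        for i, p in enumerate(priorities):
            self.tree[self.size + i] = float(p)
        # Each internal node stores the sum of its two children.
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    @property
    def total(self):
        # The root holds p_total.
        return self.tree[1]

    def find(self, s):
        """Return the leaf index whose cumulative-priority interval contains s."""
        i = 1
        while i < self.size:
            left = 2 * i
            if s < self.tree[left]:
                i = left
            else:
                s -= self.tree[left]
                i = left + 1
        # Clamp in case s == p_total lands on a zero-priority padding leaf.
        return min(i - self.size, self.n - 1)


def sample_stratified(tree, k, rng):
    """One uniform draw from each of k equal segments of [0, p_total]."""
    seg = tree.total / k
    return [tree.find(rng.uniform(i * seg, (i + 1) * seg)) for i in range(k)]


def sample_independent(tree, k, rng):
    """k independent uniform draws from the whole range [0, p_total]."""
    return [tree.find(rng.uniform(0.0, tree.total)) for _ in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    tree = SumTree([1.0, 1.0, 1.0, 1.0])
    print("stratified: ", sample_stratified(tree, 4, rng))
    print("independent:", sample_independent(tree, 4, rng))
```

Running both functions many times on the same priorities and counting how often each index appears per batch is one way to compare the two schemes empirically.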

edited Nov 01 '20 at 19:04

nbro


asked Sep 29 '20 at 14:11

Firas_

This question was also asked [here](https://stats.stackexchange.com/q/362036). – nbro Nov 01 '20 at 19:05

0 Answers