An important property of a reinforcement learning problem is whether the agent's environment is static, meaning that nothing changes while the agent remains inactive. Different learning methods rely on this assumption to varying degrees.
How can I check whether, and if so where, the Monte Carlo algorithm, temporal-difference learning (TD(0)), the Dyna-Q architecture, and R-Max implicitly assume a static environment?
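For concreteness, here is the tabular TD(0) update I have in mind, as a minimal Python sketch. The environment interface (`env.reset`, `env.step`, `env.num_states`) and the decaying step-size schedule are my own assumptions for illustration, not taken from a specific textbook formulation:

```python
import numpy as np

def td0_value_estimation(env, policy, num_episodes, gamma=0.99):
    """Tabular TD(0) policy evaluation with a decaying step size."""
    V = np.zeros(env.num_states)       # value estimate per state
    visits = np.zeros(env.num_states)  # visit count per state

    for _ in range(num_episodes):
        s = env.reset()
        done = False
        while not done:
            a = policy(s)
            s_next, r, done = env.step(a)
            visits[s] += 1
            # decaying step size: all past experience is weighted equally,
            # which only makes sense if the environment never changes
            alpha = 1.0 / visits[s]
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V
```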
How could I modify the relevant learning methods so that they can, at least in principle, adapt to a changing environment? (Assume that $\epsilon$ is sufficiently large.)
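For instance, my first guess for TD(0) would be to replace the decaying step size with a constant $\alpha$, so that the estimate becomes an exponential recency-weighted average and old experience is gradually forgotten. Whether this is the right kind of modification, and what the analogous change would be for the other methods, is exactly what I'm asking. A sketch under the same assumed interface as above:

```python
import numpy as np

def td0_nonstationary(env, policy, num_episodes, gamma=0.99, alpha=0.1):
    """Tabular TD(0) with a constant step size, so the value estimate
    can track a changing environment instead of averaging all history."""
    V = np.zeros(env.num_states)

    for _ in range(num_episodes):
        s = env.reset()
        done = False
        while not done:
            a = policy(s)
            s_next, r, done = env.step(a)
            # constant alpha: the weight of old experience decays
            # geometrically, so drifting dynamics are eventually forgotten
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V
```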