In this Stanford lecture (at 35:47 and 37:00), the professor says that Monte Carlo (MC) linear function approximation does not always converge, and she gives an example. In general, under what conditions does MC linear function approximation converge, and when does it fail?
Why do people use MC linear function approximation if it sometimes does not converge?
The lecture also gives the definition of the stationary distribution of a policy, and I am not sure whether using it for function approximation leads to convergence.
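
For concreteness, here is a minimal sketch of what "MC linear function approximation" refers to in this question, assuming it means every-visit Monte Carlo policy evaluation in which the value estimate is `features[s] @ w` and `w` is moved by a gradient step toward each observed return. The function names, the three-state chain, the feature matrix, and the step size are illustrative assumptions, not taken from the lecture. The toy chain is deterministic only so that the result is easy to check.

```python
import numpy as np


def mc_linear_evaluation(features, sample_episode, gamma=1.0, alpha=0.05,
                         num_episodes=2000):
    """Every-visit Monte Carlo policy evaluation with V(s) ~ features[s] @ w."""
    w = np.zeros(features.shape[1])
    for _ in range(num_episodes):
        states, rewards = sample_episode()
        g = 0.0
        # Walk the episode backwards so the return g can be accumulated
        # incrementally: G_t = r_t + gamma * G_{t+1}.
        for s, r in zip(reversed(states), reversed(rewards)):
            g = r + gamma * g
            x = features[s]
            # Gradient step on the squared error (g - x @ w)^2 / 2.
            w += alpha * (g - x @ w) * x
    return w


def toy_episode():
    """Hypothetical chain 0 -> 1 -> 2 -> terminal with reward 1 per step."""
    return [0, 1, 2], [1.0, 1.0, 1.0]


# Features [1, s]: a bias term plus the state index (an assumed choice).
features = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
w = mc_linear_evaluation(features, toy_episode)
values = features @ w  # estimated value of each state
```

On this chain the returns from states 0, 1, 2 are 3, 2, 1, which the features can represent exactly, so it is a benign case. The question is about what happens in general, where the returns are noisy and the features cannot represent the true values.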