In the book "Reinforcement Learning: An Introduction" (2018) Sutton and Barto explain, on page 221, a form of tile coding using hashing, to reduce memory consumption.
I have two questions about that:
How can this approach reduce memory consumption? Doesn't it just depend on the number of tiles (you have to store one weight for each tile)?
They state that there is only a "little loss of performance". In my understanding, the sense of tile coding (and coarse coding) is, that near-by states have many tiles in common and far-away states have only few tilings in common. With tilings "randomly spread throughout the state space" this isn't the case. How does this not influence performance?