3

I know this might be specific to different problems, but does anyone know if there is any rule of thumb or references on what constitutes a large state space?

I know that, according to multiple papers, tabular Q-learning is not suitable for problems with large state spaces, but I can't find any indication of what would constitute a large state space.

I've been working on a problem where my state space is about 30x30 and updating my Q-learning with the tabular method runs and works well. Would 100x100 start to become too big or 400x400?

nbro
  • 39,006
  • 12
  • 98
  • 176
nickredsox
  • 31
  • 4
  • I know that this is true, but maybe you should cite 1-2 papers that state that Q-learning is not suitable for problems with "large" state spaces. – nbro Nov 28 '20 at 11:59

1 Answer

5

I know this might be specific to different problems but does anyone know if there is any rule of thumb or references on what constitutes a large state space?

Not really; it is all relative. There are two main ways in which the scale of a value table might become too much:

  • Memory required to represent the table. This is relatively simple to calculate for any size.

  • Time required to populate the table with accurate estimates. This depends on how you are collecting that data, and how much variance there is in the outcomes for the same state and action choices.

If you are using action values you need to allow for the fact that the table size is not just $|\mathcal{S}|$, but $|\mathcal{S} \times \mathcal{A}|$.
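To make the memory side concrete, here is a small sketch of sizing an action-value table; the grid size and action count are illustrative assumptions, not taken from the question:

```python
import numpy as np

# Hypothetical sizes for illustration: a 400x400 grid with 4 actions.
n_states = 400 * 400   # |S| = 160,000
n_actions = 4          # |A|

# The action-value table has |S| x |A| entries, not just |S|.
q_table = np.zeros((n_states, n_actions), dtype=np.float64)

# Memory in megabytes (8 bytes per float64 entry).
memory_mb = q_table.nbytes / 1e6
print(memory_mb)  # 5.12 MB - trivially small by modern standards
```

Even the 400x400 case from the question costs only a few megabytes, so for problems at this scale the time to populate the table, not memory, is usually the binding constraint.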

If you are running fast, local simulations of the environment, then table sizes of a million or even a hundred million are not unreasonable. Software language or library choice for the simulation and agent can have a significant impact at these larger scales.

If any part of the state description is a continuous variable, then the table size would be infinite in theory, so no finite table that could actually be realised on a computer could fully capture it. You must use some form of approximation to get practical results. However, even then you can still use tabular approaches when the approximation involves discretising the state variables directly; there are a few different methods in machine learning to do that, e.g. tile coding. You can also use tile coding in large discrete spaces.
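As a minimal sketch of the simplest form of discretisation (uniform binning, which tile coding generalises with multiple offset tilings), something like this maps a continuous value onto a table index; the ranges and bin count are illustrative assumptions:

```python
import numpy as np

def discretise(x, low, high, n_bins):
    """Map a continuous value in [low, high] to an integer bin index."""
    # Clip so values at or beyond the boundaries map to valid bins.
    x = np.clip(x, low, high)
    idx = int((x - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)  # x == high falls in the last bin

# e.g. discretise(0.37, 0.0, 1.0, 10) puts the value in bin 3
```

Once each continuous state variable is binned this way, the tuple of bin indices can index an ordinary Q-table exactly as in the discrete case.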

I've been working on a problem where my state space is about 30x30 and updating my Q-learning with the tabular method runs and works well. Would 100x100 start to become too big or 400x400?

It depends on the action space too, but 400x400 gives a state space of 160,000. Assuming fewer than 10 actions, this is still well within the realm of tabular methods. You probably have fewer than 1,000,000 parameters to handle for a tabular method, which compares well to a moderately-sized neural network, with the advantage of better stability and the possibility of creating a fully optimal agent. If simulating the environment is fast, something at that scale may only take a few minutes to a few hours to fully optimise in tabular form.
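For reference, the core of tabular Q-learning at this scale is just an array update; the environment step, reward, and hyperparameters below are illustrative assumptions, not details from the question:

```python
import numpy as np

# Tabular Q-learning for a 400x400 grid, states encoded as flat indices.
n_states, n_actions = 400 * 400, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # assumed hyperparameters
rng = np.random.default_rng(0)

def q_update(s, a, r, s_next, done):
    """One tabular Q-learning backup: Q(s,a) += alpha * TD error."""
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

def epsilon_greedy(s):
    """Behaviour policy: random action with probability epsilon."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))
```

Each update touches one table entry, so a fast simulator can push through millions of these backups per minute, which is why 160,000 states remains comfortably tabular.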

As with most machine learning though, if you care about some optimisation, such as memory size of the agent, speed at which it solves the problem, or some other metric, then you will need to run experiments with different approaches. Then you will have some knowledge about the best approach for your problem, given your definition of "best agent". The experience you gain with that might extend and apply to similar problems in future.

Neil Slater
  • 28,678
  • 3
  • 38
  • 60