I found some literature on the design of action spaces, e.g. that discretizing continuous actions in video-game environments can be crucial for successful learning (Action Space Shaping in Deep Reinforcement Learning, 2020).
However, I didn't find anything discussing this topic for observations. I'm not talking about normalizing observations; NNs are mainly used as function approximators in DRL, and they perform better with normalized inputs.
What I mean is, e.g., continuous observations that might be discretized into ordinal or categorical values. Imagine a domain where the agent can open or close a gate, and it should learn whether the next object should pass or not. The agent also has to learn how to close the gate (with actions being "reduce or increase gate width", so the agent has to pick one action multiple times in a row to reach a certain width). Now one can design the environment to provide a continuous value for the gate width, or discretize this and provide flags that indicate whether the gate is open enough or not. Are there references that discuss this topic?
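Just to make the second design concrete, here is a minimal sketch of what I have in mind, written as a Gymnasium `ObservationWrapper`. The environment, the position of the width in the observation vector, and the "open enough" threshold are all hypothetical; it only assumes the base environment exposes a `Box` vector observation.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class DiscretizeGateWidth(gym.ObservationWrapper):
    """Replace a continuous gate-width observation with a 0/1 'open enough' flag."""

    def __init__(self, env, width_index=0, open_threshold=0.5):
        super().__init__(env)
        self.width_index = width_index        # where the width sits in the obs vector (assumed)
        self.open_threshold = open_threshold  # hypothetical "wide enough" cutoff
        low = self.observation_space.low.copy()
        high = self.observation_space.high.copy()
        low[width_index], high[width_index] = 0.0, 1.0
        self.observation_space = spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        obs = obs.astype(np.float32).copy()
        # Replace the raw width with a categorical flag: 1.0 if open enough, else 0.0
        obs[self.width_index] = float(obs[self.width_index] >= self.open_threshold)
        return obs
```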
The reason I'm thinking about this: considering a policy as a mapping from states to actions, discretizing the observations should eventually lead to a smaller mapping that has to be learned. Surely this can come at the cost of lower peak performance, but I could also imagine faster training.
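A toy illustration of that intuition, with arbitrary bin edges chosen only to show how the set of distinct observations shrinks after binning:

```python
import numpy as np

bin_edges = np.array([0.25, 0.5, 0.75])           # 4 ordinal bins over [0, 1]
widths = np.random.default_rng(0).uniform(0.0, 1.0, size=1000)

binned = np.digitize(widths, bin_edges)           # each width becomes one of {0, 1, 2, 3}
print(len(np.unique(widths)))                     # ~1000 distinct continuous values
print(len(np.unique(binned)))                     # at most 4 distinct discrete values
```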
Is my thinking correct here? Or could discretizing the observations harm learning in other ways?