Questions tagged [observation-spaces]

For questions about observation spaces in the context of reinforcement learning and other AI sub-fields.

10 questions
4
votes
1 answer

Are multi-agent or self-play environments always automatically POMDPs?

As part of my thesis, I'm working on a zero-sum game with RL to train an agent. The game is a real-time variant of Pong; one could imagine playing Pong with both sides being foosball rods. As I see it, this is an MDP with perfect…
3
votes
2 answers

Should I apply normalization to the observations in deep reinforcement learning?

I am new to DRL and trying to implement my custom environment. I want to know whether normalization and regularization techniques are as important in RL as in deep learning. In my custom environment, the state/observation values are in different ranges…
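For readers landing here: the standard trick is to normalize observations with running statistics collected during training. A minimal sketch using Welford's algorithm (the class and the sample values are illustrative, not from any answer):

```python
import numpy as np

class RunningNormalizer:
    """Normalizes observations with running mean/variance (Welford's algorithm)."""

    def __init__(self, shape, eps=1e-8):
        self.mean = np.zeros(shape)
        self.m2 = np.zeros(shape)   # running sum of squared deviations
        self.count = 0
        self.eps = eps

    def normalize(self, obs):
        obs = np.asarray(obs, dtype=np.float64)
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (obs - self.mean)  # uses the updated mean
        var = self.m2 / self.count
        return (obs - self.mean) / np.sqrt(var + self.eps)

# hypothetical observations whose features live on very different scales
norm = RunningNormalizer(shape=(3,))
for obs in ([250.0, 0.01, -3000.0], [260.0, 0.02, -2900.0], [240.0, 0.03, -3100.0]):
    print(norm.normalize(obs))  # drifts toward zero mean, unit variance
```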
3
votes
2 answers

What happens when the agent faces a state that it has never encountered before?

I have a network with nodes and links, each of them with a certain amount of resources (which can take discrete values) in the initial state. At random time steps, a service is generated and, based on the agent's action, the network status changes,…
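For context on why function approximation matters here: unlike a lookup table, a network keeps producing Q-values for inputs it has never seen, and nearby inputs get similar outputs. A toy numpy sketch (random weights and a hypothetical 8-resource state; nothing here comes from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# a toy Q-network: vector of resource levels -> one Q-value per action
# (weights are random here; in practice they come from training, e.g. DQN)
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 4)), np.zeros(4)

def q_values(state):
    h = np.maximum(state @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2                    # Q-value per action

seen   = np.array([3, 1, 4, 0, 2, 2, 1, 5], dtype=float)
unseen = np.array([3, 1, 4, 0, 2, 2, 1, 6], dtype=float)  # never visited

# the network still outputs Q-values for the unseen state; similar inputs
# yield similar outputs, which is the generalization being asked about
print(q_values(seen))
print(q_values(unseen))
```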
2
votes
0 answers

What to look out for when designing an environment regarding observations?

When designing an environment, what should one look out for when designing the observation space so that the environment is as easy as possible for an agent to learn? E.g., make sure the Markov property is fulfilled if possible, but I mean also…
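One concrete instance of the Markov point raised above: for a moving object, position alone is not Markov, since the next position also depends on an unobserved velocity. A minimal sketch assuming a Gymnasium-style Box space; the bounds are purely illustrative:

```python
import numpy as np
from gymnasium import spaces  # assumes gymnasium is installed

# Position alone is not Markov for a moving object: the next position
# depends on the current velocity, which the agent cannot observe here.
non_markov_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

# Including velocity restores the Markov property; bounding and normalizing
# the ranges also keeps all features on a comparable scale for the network.
markov_space = spaces.Box(
    low=np.array([-1.0, -1.0, -0.1, -0.1], dtype=np.float32),  # x, y, vx, vy
    high=np.array([1.0, 1.0, 0.1, 0.1], dtype=np.float32),
    dtype=np.float32,
)
```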
2
votes
0 answers

How do neural networks deal with inputs of different sizes that are padded to a common size?

I am trying to create an environment for RL where the size of my input (observation space) is not fixed. As a workaround, I thought about padding the input to a maximum size and then assigning "null" to the values that do not exist. Now, these…
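A common way to make such padding workable (an illustrative sketch, not something stated in the question): pair the padded values with an explicit binary mask, since a sentinel like 0.0 is ambiguous on its own. MAX_ITEMS and PAD_VALUE are hypothetical:

```python
import numpy as np

MAX_ITEMS = 10   # hypothetical maximum observation size
PAD_VALUE = 0.0  # "null" fill; alone it is ambiguous with a real 0.0

def pad_observation(items):
    """Pads a variable-length observation and returns an explicit mask.

    Appending a binary mask lets the network distinguish a real 0.0
    from padding, instead of relying on a sentinel value alone.
    """
    items = np.asarray(items, dtype=np.float32)
    padded = np.full(MAX_ITEMS, PAD_VALUE, dtype=np.float32)
    mask = np.zeros(MAX_ITEMS, dtype=np.float32)
    padded[: len(items)] = items
    mask[: len(items)] = 1.0
    return np.concatenate([padded, mask])  # fixed-size input: values + mask

print(pad_observation([0.3, 0.0, 0.9]))  # length is always 2 * MAX_ITEMS
```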
1
vote
1 answer

Variable observation space at each episode

I have an environment with continuous actions and state variables. Every time I reset my env, between 2 and 5 balls spawn randomly in a 100x100 box. One of those balls (the red one) will receive an action (direction of movement) and will move…
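One common workaround for this kind of variable-count observation (a sketch under assumed conventions, not necessarily what any answer proposes): reserve a fixed number of slots, with a presence flag per slot, so the state shape stays constant across resets:

```python
import numpy as np

MAX_BALLS = 5  # the env spawns between 2 and 5 balls

def encode_balls(ball_positions):
    """Fixed-size observation: (x, y, present) per slot, red ball in slot 0.

    Unused slots stay all-zero with present=0, so the observation shape
    is constant even though the number of balls varies per episode.
    """
    obs = np.zeros((MAX_BALLS, 3), dtype=np.float32)
    for i, (x, y) in enumerate(ball_positions[:MAX_BALLS]):
        obs[i] = (x / 100.0, y / 100.0, 1.0)  # normalize to the 100x100 box
    return obs.flatten()

# hypothetical episode with 3 balls; the red ball is placed first
print(encode_balls([(50.0, 50.0), (10.0, 80.0), (90.0, 20.0)]).shape)  # (15,)
```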
1
vote
1 answer

Scrabble rack observation with MuZero

Currently I'm trying to implement Scrabble with MuZero. The $15 \times 15$ game board observation (as input) is of size $27 \times 15 \times 15$ (26 letters + 1 wildcard) with values of 0 or 1. However, I'm having difficulty finding a suitable way…
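One plausible rack encoding that matches the plane format (an assumption-laden sketch, not an answer from the thread): since the rack is an unordered multiset of at most 7 tiles, per-letter counts broadcast to constant planes are order-invariant and stack directly onto the board channels:

```python
import numpy as np

LETTERS = "abcdefghijklmnopqrstuvwxyz?"  # 26 letters + wildcard

def rack_planes(rack, board_size=15):
    """Encodes a rack as 27 constant planes of scaled tile counts.

    Per-letter counts (divided by the 7-tile rack limit to stay in
    [0, 1], like the board planes) are invariant to tile order and can
    be stacked onto the 27x15x15 board planes as extra input channels.
    """
    counts = np.zeros(len(LETTERS), dtype=np.float32)
    for tile in rack:
        counts[LETTERS.index(tile)] += 1.0
    counts /= 7.0
    return np.broadcast_to(
        counts[:, None, None], (len(LETTERS), board_size, board_size)
    ).copy()

print(rack_planes(list("aaezq?r")).shape)  # (27, 15, 15)
```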
1
vote
0 answers

Does the order in which the features are concatenated to create the state (or observation) matter?

I'm experimenting with an RL agent that interacts with the following environment. The learning algorithm is double DQN. The neural network represents the function from state to action. It's built with a Keras Sequential model and has two dense layers…
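The usual answer in sketch form (illustrative, with hypothetical feature names): which order you pick is irrelevant to a dense network, since it just learns weights for each slot, but the order must stay fixed across steps and episodes so each input slot keeps its meaning:

```python
import numpy as np

FEATURE_ORDER = ("position", "velocity", "fuel")  # fixed once, never changed

def build_state(features):
    """Concatenates features in one canonical order.

    A dense network is insensitive to WHICH order you choose, but the
    order must be identical at every step and every episode; otherwise
    input slot i changes meaning between samples.
    """
    return np.concatenate([np.atleast_1d(features[k]) for k in FEATURE_ORDER])

state = build_state({"velocity": [0.1, -0.2], "position": [3.0, 4.0], "fuel": 0.7})
print(state)  # always [pos_x, pos_y, vel_x, vel_y, fuel]
```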
0
votes
1 answer

What are the differences between loss surfaces that "derive" from different observations?

If I understand correctly, each observation within a dataset creates a different loss surface on which we want to find the global minimum. How different are those surfaces from one another? Would it be correct to say that they differ like (for example)…
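For reference, the standard relationship (a general fact, not something asserted in the question): the full loss is the average of the per-observation surfaces over the same parameter space,

$$\mathcal{L}(\theta) = \frac{1}{N}\sum_{i=1}^{N} \ell\big(f_\theta(x_i),\, y_i\big),$$

so each $\ell_i(\theta) = \ell(f_\theta(x_i), y_i)$ shares the domain $\theta$ but generally has different minima; minibatch SGD descends on the average of a few of these surfaces at a time.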
0
votes
1 answer

When should discretization of observations be considered?

I found some literature regarding the design of action spaces, e.g. that discretization of continuous actions in video-game environments can be crucial for successful learning (Action Space Shaping in Deep Reinforcement Learning,…
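For comparison, a minimal sketch of observation binning (illustrative bounds and bin count; the cited paper is about action spaces, not this code): mapping a continuous observation to a tuple of bin indices turns it into a discrete state usable as a table key:

```python
import numpy as np

# illustrative bounds and resolution; real values depend on the environment
LOW, HIGH, N_BINS = np.array([-1.0, -5.0]), np.array([1.0, 5.0]), 10

def discretize(obs):
    """Maps a continuous observation to integer bin indices.

    This yields N_BINS**dim discrete states, e.g. for tabular methods;
    too-coarse bins alias distinct states, too-fine bins blow up the
    table, which is the usual trade-off behind this design question.
    """
    scaled = (np.asarray(obs) - LOW) / (HIGH - LOW)  # rescale into [0, 1]
    bins = (scaled * N_BINS).astype(int)
    return tuple(int(b) for b in np.clip(bins, 0, N_BINS - 1))

print(discretize([0.13, -2.4]))  # (5, 2)
```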