For questions about OpenAI's gym library, which provides a set of APIs to access different types of environments to train reinforcement learning agents.
Questions tagged [gym]
64 questions
10
votes
3 answers
What do the different actions of the OpenAI gym's environment of 'Pong-v0' represent?
Printing action_space for Pong-v0 gives Discrete(6) as output, i.e. $0, 1, 2, 3, 4, 5$ are actions defined in the environment as per the documentation. However, the game needs only 2 controls. Why do we have this discrepancy? Further, is that…

cur10us
- 211
- 1
- 2
- 4
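For reference on this question: Atari environments expose the meaning of each integer action via `env.unwrapped.get_action_meanings()`. A minimal sketch of the mapping this typically returns for Pong-v0 (the exact list depends on the installed ALE version, so treat this as an assumption rather than a guarantee):

```python
# Typical action meanings for Pong-v0, as reported by
# env.unwrapped.get_action_meanings() under the standard ALE mapping.
PONG_ACTION_MEANINGS = {
    0: "NOOP",
    1: "FIRE",
    2: "RIGHT",      # in Pong, commonly moves the paddle up
    3: "LEFT",       # in Pong, commonly moves the paddle down
    4: "RIGHTFIRE",  # effectively redundant with RIGHT in Pong
    5: "LEFTFIRE",   # effectively redundant with LEFT in Pong
}

def describe(action: int) -> str:
    """Map a Discrete(6) action index to its ALE meaning."""
    return PONG_ACTION_MEANINGS[action]
```

Because several of the six ALE actions collapse to the same paddle movement in Pong, only a subset of them is actually needed to play.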
7
votes
1 answer
Deep Q-Learning "catastrophic drop" reasons?
I am implementing some "classical" papers in model-free RL, like DQN, Double DQN, and Double DQN with Prioritized Replay.
Through the various models I'm running on CartPole-v1 using the same underlying NN, I am noticing all of the above 3 exhibit a…

Virus
- 71
- 1
- 5
7
votes
1 answer
What are the state-of-the-art results in OpenAI's gym environments?
What are the state-of-the-art results in OpenAI's gym environments? Is there a link to a paper/article that describes them and how these SOTA results were calculated?

Tofara Moyo
- 71
- 2
6
votes
1 answer
Why did OpenAI's gym website close?
OpenAI's gym website redirects to the GitHub repository. Why did OpenAI's gym website close?

Franck Dernoncourt
- 2,626
- 1
- 19
- 31
6
votes
2 answers
My Deep Q-Learning Network does not learn for OpenAI gym's cartpole problem
I am implementing OpenAI gym's cartpole problem using Deep Q-Learning (DQN). I followed tutorials (video and otherwise) and learned all about it. I implemented a code for myself and I thought it should work, but the agent is not learning. I will…

SJa
- 371
- 2
- 15
5
votes
1 answer
How to define an action space when an agent can take multiple sub-actions in a step?
I'm attempting to design an action space in OpenAI's gym and hitting the following roadblock. I've looked at this post which is closely related but subtly different.
The environment I'm writing needs to allow an agent to make between $1$ and $n$…

Seyed Moein Ayyoubzadeh
- 130
- 8
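One common workaround when an agent must pick several sub-actions per step is to flatten the composite action into a single `Discrete` index (gym also provides `MultiDiscrete` and `Tuple` spaces for this). A minimal, gym-free sketch with hypothetical sub-action sizes:

```python
from itertools import product

# Hypothetical example: each step consists of two sub-actions,
# sub-action A with 3 choices and sub-action B with 5 choices.
SUB_A, SUB_B = 3, 5

def encode(a: int, b: int) -> int:
    """Flatten the pair (a, b) into one index for a Discrete(SUB_A * SUB_B) space."""
    return a * SUB_B + b

def decode(idx: int) -> tuple:
    """Recover the sub-action pair from the flat Discrete index."""
    return divmod(idx, SUB_B)

# Every (a, b) pair round-trips through a single integer index.
assert all(decode(encode(a, b)) == (a, b)
           for a, b in product(range(SUB_A), range(SUB_B)))
```

Flattening keeps the environment compatible with algorithms that only support a single discrete action head, at the cost of an exponentially larger space as more sub-actions are combined.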
5
votes
1 answer
How powerful are OpenAI's Gym and Universe in the area of board games?
I'm a big fan of computer board games and would like to make Python chess/go/shogi/mancala programs. Having heard of reinforcement learning, I decided to look at OpenAI Gym.
But first of all, I would like to know, is it possible using OpenAI…

Taissa
- 63
- 4
4
votes
1 answer
What is the mapping between actions and numbers in OpenAI's gym?
In a gym environment, the action space is often a discrete space, where each action is labeled by an integer. I cannot find a way to figure out the correspondence between action and number. For example, in frozen lake, the agent can move Up, Down,…

Llewlyn
- 143
- 4
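For FrozenLake specifically, the correspondence is fixed in the environment source, where the constants `LEFT = 0`, `DOWN = 1`, `RIGHT = 2`, `UP = 3` are defined. A small sketch, assuming the standard gym definition:

```python
# Action-to-direction mapping used by gym's FrozenLake,
# mirroring the LEFT/DOWN/RIGHT/UP constants in the environment source.
FROZEN_LAKE_ACTIONS = {0: "Left", 1: "Down", 2: "Right", 3: "Up"}

def action_name(action: int) -> str:
    """Translate a FrozenLake Discrete(4) action index to its direction."""
    return FROZEN_LAKE_ACTIONS[action]
```

In general there is no universal convention across environments, so checking the environment's source (or its documentation) is the reliable way to recover the mapping.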
4
votes
2 answers
How do I get started with multi-agent reinforcement learning?
Is there any tutorial that walks through a multi-agent reinforcement learning implementation (in Python) using libraries such as OpenAI's Gym (for the environment), TF-agents, and stable-baselines-3?
I searched a lot, but I was not able to find any…

Rnj
- 221
- 2
- 6
4
votes
0 answers
Unable to train Coach for Banana-v0 Gym environment
I have just started playing with Reinforcement learning and starting from the basics I'm trying to figure out how to solve Banana Gym with coach.
Essentially, the Banana-v0 env represents a banana shop that buys a banana for $1 on day 1 and has 3 days…

KeepLearning
- 141
- 3
3
votes
1 answer
Finding the true Q-values in gymnasium
I'm very interested in the true Q-values of state-action pairs in the classic control environments in gymnasium. Contrary to the usual goal, the ordering of the Q-values itself is irrelevant; a very close to accurate estimation of the Q-values is…

Mark B
- 33
- 3
3
votes
0 answers
How to deal with a moving target in the Lunar Lander environment with DDPG?
I have noticed that DDPG does rather well at solving environments with a static target.
For example, in the default Lunar Lander, the flags do not change position. So the DDPG model learns how to get to the center of the screen and land fairly…

user1779362
- 131
- 2
3
votes
2 answers
How does an episode end in OpenAI Gym's "MountainCar-v0" environment?
I am working on OpenAI's "MountainCar-v0" environment. In this environment, each step that an agent takes returns (among other values) the variable named done of type boolean. The variable gets a True value when the episode ends. However, I am not…

SJa
- 371
- 2
- 15
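For context on such questions: in MountainCar-v0 the episode ends either when the car reaches the goal or when the `TimeLimit` wrapper hits the step cap, so `done` alone does not say which of the two happened. A sketch of the two conditions, assuming the standard environment constants:

```python
GOAL_POSITION = 0.5      # goal position in the MountainCar-v0 source
GOAL_VELOCITY = 0.0      # default goal velocity
MAX_EPISODE_STEPS = 200  # cap enforced by the TimeLimit wrapper

def episode_done(position: float, velocity: float, step_count: int) -> bool:
    """Mirror the two ways a MountainCar-v0 episode can end."""
    reached_goal = position >= GOAL_POSITION and velocity >= GOAL_VELOCITY
    timed_out = step_count >= MAX_EPISODE_STEPS
    return reached_goal or timed_out
```

Because `done` is True both on success and on timeout, later gym/gymnasium APIs split the flag into separate `terminated` and `truncated` signals.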
3
votes
1 answer
What should the action space for the card game Crib be?
I'm working on creating an environment for a card game, in which the agent chooses to discard certain cards in the first phase of the game and uses the remaining cards to play with. (The game is Crib, if you are familiar with it.)
How can I make an…

Jordan Coil
- 33
- 3
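One common approach for a discard phase like Crib's is to enumerate all k-card subsets of the hand and use a single `Discrete` space over them. A hypothetical sketch for discarding 2 of 6 cards:

```python
from itertools import combinations

# Hypothetical sketch for the Crib discard phase: choose 2 cards to
# discard from a 6-card hand. Enumerating every 2-card subset lets the
# action space be a single Discrete(15), since C(6, 2) = 15.
HAND_SIZE, DISCARD_COUNT = 6, 2
DISCARD_CHOICES = list(combinations(range(HAND_SIZE), DISCARD_COUNT))

def action_to_discards(action: int) -> tuple:
    """Map a Discrete action index to the pair of hand slots to discard."""
    return DISCARD_CHOICES[action]
```

Indexing hand *slots* rather than specific cards keeps the action space small and fixed, with the environment translating slots back to concrete cards each deal.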
3
votes
1 answer
Are there OpenAI Gym continuing environments (other than inverted pendulum) and baselines?
I would like to use OpenAI Gym to solve a continuing environment, that is, a problem with a single, never-ending episode (please note I don't mean a continuous environment with continuous state and actions).
The only continuing environment I found…

user118967
- 208
- 1
- 8