
I've recently been using Monte Carlo Tree Search (MCTS) on OpenAI Gym Atari, but the results aren't satisfying.

Without rendering, a game lasts about 180 steps (that is, env.step() is called about 180 times) with a random agent. With my MCTS agent, however, the game lasts only 12 steps, and it takes a long time to choose each move.

I suspect the rollout is the problem. I build the MCTS tree from nodes that each contain an AtariEnv object, and on every rollout I deepcopy that object and accumulate the reward. Expanding a node and running one rollout takes about 1 second, so 100 iterations would mean roughly 100 seconds of waiting per move.

My rollout code is shown below:

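from copy import deepcopy

def rollout_(current, if_render):
    '''
    current is going to be a Node object (its .state holds an AtariEnv).
    if_render is not used in this function.
    '''
    sandBox = deepcopy(current.state)  # copy so the env stored in the tree is untouched
    endReward = 0
    done = False
    while not done:
        action = sandBox.action_space.sample()  # uniformly random action
        _, reward, done, info = sandBox.step(action)  # weird obs_next return, ignored
        if reward > 0:
            reward *= 2  # double positive rewards
        endReward += reward - 0.008  # small penalty for every step taken
    return endReward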
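
To check my guess, I could time the copy and the stepping loop separately. Below is a minimal sketch of that idea. `ToyEnv`, `sample_action` and `timed_rollout` are hypothetical names I made up for illustration (a real AtariEnv would take the place of `ToyEnv`), so it only shows the measuring method, not my actual timings.

```python
import random
import time
from copy import deepcopy


class ToyEnv:
    """Hypothetical stand-in for an AtariEnv: a large buffer plus a step counter."""

    def __init__(self, horizon=200, seed=0):
        self.horizon = horizon
        self.t = 0
        self.rng = random.Random(seed)
        # Assumption: a big list imitates emulator state that makes deepcopy costly.
        self.buffer = [0] * 100_000

    def sample_action(self):
        # Pick one of 4 made-up actions at random.
        return self.rng.randrange(4)

    def step(self, action):
        # Same 4-tuple shape as the step() call in my rollout code.
        self.t += 1
        reward = 1.0 if action == 0 else 0.0
        done = self.t >= self.horizon
        return None, reward, done, {}


def timed_rollout(state):
    """Random rollout that reports copy time and stepping time separately."""
    t0 = time.perf_counter()
    sandbox = deepcopy(state)  # the part I suspect is slow
    t1 = time.perf_counter()

    total, steps, done = 0.0, 0, False
    while not done:
        _, reward, done, _ = sandbox.step(sandbox.sample_action())
        total += reward
        steps += 1
    t2 = time.perf_counter()

    return {"copy_s": t1 - t0, "step_s": t2 - t1, "steps": steps, "return": total}


env = ToyEnv()
stats = timed_rollout(env)
print(stats)
```

If `copy_s` turns out to be large compared with `step_s`, the deepcopy is the bottleneck. Otherwise the long random rollouts themselves are what cost the time.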

Can anyone help?

Dibbla