I've recently been using Monte Carlo Tree Search (MCTS) with OpenAI Gym Atari environments, but the results aren't satisfying.
Without rendering, a random agent keeps the game going for about 180 steps (that is, env.step() gets called roughly 180 times before the episode ends). My MCTS agent, however, only makes the game last about 12 steps, and it takes a long time to pick each action.
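For reference, this is roughly how I measured the random-agent baseline (the game name below is just a placeholder; the actual title doesn't matter here):

import gym

env = gym.make("Breakout-v0")  # placeholder; any Atari game shows the same pattern
env.reset()
steps, done = 0, False
while not done:
    # random action; old gym API returns a 4-tuple from step()
    _, _, done, _ = env.step(env.action_space.sample())
    steps += 1
print("random agent survived", steps, "steps")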
I suspect the problem is in the rollout. I build the MCTS tree from nodes that each hold an AtariEnv object, and every time I roll out I deepcopy that environment, play it to the end with random actions, and add up the reward. Expanding a node plus one rollout takes about 1 second, so with 100 iterations per move the waiting time is massive.
My rollout code is shown below:
from copy import deepcopy

def rollout_(current, if_render):
    '''
    current is a Node object; its .state holds an AtariEnv.
    Plays a random rollout on a deepcopy of that environment
    and returns the (shaped) cumulative reward.
    if_render is currently unused.
    '''
    sandBox = deepcopy(current.state)
    endReward = 0
    done = False
    while not done:
        action = sandBox.action_space.sample()
        _, reward, done, info = sandBox.step(action)  # first return value is obs_next, unused here
        if reward > 0:
            reward *= 2              # double positive rewards
        endReward += reward - 0.008  # small per-step penalty
    return endReward
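For context, my node class is essentially just a wrapper around an environment copy; the version below is simplified (the real one also tracks children, visit counts and value estimates), but it shows how rollout_ gets called:

import gym

class Node:
    # Simplified stand-in for my actual tree node.
    def __init__(self, state):
        self.state = state  # the gym Atari environment itself

env = gym.make("Breakout-v0")  # placeholder game
env.reset()
root = Node(env)
value = rollout_(root, if_render=False)  # uses rollout_ defined above
print("rollout return:", value)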
Can anyone help?