7

I want to learn a lot about the AI of CCG, such as Hearthstone. And now I have known one of the main algorithms that used in this kind of games, MCTS. It analyses the most promising moves, and expands the search tree based on random sampling of the search space. But there are too many random events in this game that can cause different results to one battle. For example, a card can randomly deal X damage to a hero or other follower, and X is a random number from 0 to 30. The number of X is important for the next decision, but there will be a low accuracy by only using MCTS.

So what does the AI do to deal with these random events?

Dennis Soemers
  • 9,894
  • 2
  • 25
  • 66
zen
  • 73
  • 2

1 Answers1

4

The most "standard" implementation of MCTS probably involves storing copies of game states inside nodes. This works fine for deterministic games, but not for non-deterministic games due to the reasons you mentioned.

In non-deterministic games, one of the easiest ways to make MCTS work is to take the perspective that every node in your tree should map to / represent / encode a trajectory or sequence of actions, rather than a state. This means that every node collects statistics (num wins / num visits etc.) for the trajectory of actions it represents, rather than for the state it represents.

This perspective can be implemented very easily by simply not storing any game states in nodes at all. Instead, whenever you traverse the tree (in the Selection phase of MCTS), you should simply re-simulate all the actions along that path through the tree, starting from the root state again. This way, different visits through a single node can result in different game states being "observed" in that node due to nondeterministic game state transitions. In the limit (after an infinite amount of time), every node will estimate the expected value of the trajectory of actions leading to that node, rather than the game-theoretic value of a single, deterministic resulting game state. This whole idea is sometimes referred to as "Open-Loop MCTS", for example in this paper.


What I described above is a simple way to get better performance from MCTS in games involving nondeterministic state transitions as you described (such as random amounts of damage being dealt), but there are many much more complicated ideas out there too. I suspect more complex approaches will especially be necessary to deal with more "drastic" forms of nondeterminism (resulting from random card draws, or resulting from randomization-based approaches for dealing with imperfect information such as not knowing which cards your opponent has).

There was a Hearthstone AI competition relatively recently, and some useful ideas may be found, for example, on this page of work related to that competition.

Dennis Soemers
  • 9,894
  • 2
  • 25
  • 66
  • Hi @Dennis, where can I find pseudo code of open-loop approach? I've found some examples on github but, don't think they are correct. – Kadir Ercetin Jun 02 '21 at 22:23
  • @KadirCan Hmm I'm not sure about any really simple examples. I've used OpenLoop MCTS in both https://github.com/DennisSoemers/MaastCTS2 and https://github.com/Ludeme/Ludii/tree/master/AI/src/search/mcts but in both cases they're "advanced" implementations, also with lots of other optional for bells and whistles, and also including non-open-loop implementations at the same time for deterministic games – Dennis Soemers Jun 03 '21 at 09:13