I'm trying to implement a MCTS-based AI for a trick-taking card game.
The game : (Belote)
- The play consists of 8 tricks.
- A trick consists in each of the 4 players play successively 1 (legal) card from their hand (starting with 8 cards).
- A player can only play certain cards from his hand (depends on the current state of the trick). Thus, the other players can infer information of some other players hand based on the current and past tricks.
My implementation so far:
The node:
The current state of the round (previous tricks, current trick, the play order...) + the position of the "smart" player.
Expansion function:
I can think of 2 possibilities :
- a card is played by a player
- a whole new trick is played
Other tools that I implemented:
- A method that gives the legal cards of a hand (based on the current trick).
- A method that gives the smart player all the possible cards of another player (based on the current and previous tricks).
- A method that generates every possible next (legal) trick (based on the 2 previous methods).
Problem :
If I want to apply MCTS, I need to ensure that the succession of the simulated remaining tricks (or played cards) until the end of the round (ie the path in the tree) will not be a dead-end before reaching the final (32th) card. Indeed, a player could run out of legal cards during the simulation...
How can I deal with that constraint in an MCTS context ?
Is MCTS even a viable algorithm for what I want to achieve ?
I could of course precompute the whole possible tree ahead of time, but his size would be huge, and the interest of the MCTS is kind of lost.