I'm trying to implement the minimax algorithm with alpha beta pruning on a game that works like this:
Player 1 plays (x1, y1).
Player 2 can only see the x-value (x1) that Player 1 played (and not the y-value y1). Player 2 plays (x2, y2).
An action event happens, which may change the heuristic of the current game state.
Player 2 plays (x3, y3).
Player 1 can only see the x-value (x3) that Player 2 played (and not the y-value y3). Player 1 plays (x4, y4).
Action event. The game continues with alternating starting players for a maximum depth of 10.
To do so, I have been treating each turn as you regularly would with the minimax algorithm, with each player making moves given the set of moves already played, including the possibly hidden move from the turn before. However, I've noticed that my algorithm will return moves for Player 2 that assume that Player 1 plays a certain way when, in a "real" game, it may be the case that Player 1 plays something else (and vice versa). For example, if Player 2 could guarantee a win on a given turn under all circumstances (when Player 1 plays first for that series), it might not play optimally when it assumes Player 1 will not play its maximum-strength move.
I believe it is doing this precisely because it assumes that all moves are visible (a fully visible game state). And indeed, if that were the case, the sequence of moves it returns would be optimal.
How can I remedy this?
I do not believe that a probability-based algorithm (e.g. Expectiminimax) is necessary, since the game is entirely deterministic. The partial visibility part is making things difficult, though.
Something tells me that changing the turn order in my algorithm might be a solution to this problem, since the action event is the only time the game heuristic is changed.