
From the brief research I've done on the topic, it appears that DeepMind's AlphaZero and MuZero make decisions through Monte Carlo tree search (MCTS), wherein randomized simulations allow positions to be evaluated faster than traditional alpha-beta pruning. As the number of simulations increases, this search approaches the result of a classical exhaustive tree search.
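To make sure I have the vanilla algorithm straight, here is a rough sketch of what I understand plain MCTS with random rollouts to look like. The `State` interface (`legal_moves`, `play`, `is_terminal`, `winner`) is just something I made up to show the structure of the four phases, not anything from DeepMind's code:

```python
import math
import random

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state = state
        self.parent = parent
        self.move = move
        self.children = []
        self.visits = 0
        self.wins = 0.0
        self.untried = list(state.legal_moves())

def uct_select(node, c=1.4):
    # Pick the child maximizing the UCT score (exploitation + exploration).
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )

def rollout(state):
    # Play random moves to the end of the game and return the outcome,
    # scored from the root player's perspective (e.g. +1 / 0 / -1).
    while not state.is_terminal():
        state = state.play(random.choice(state.legal_moves()))
    return state.winner()

def mcts(root_state, n_simulations=1000):
    root = Node(root_state)
    for _ in range(n_simulations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_select(node)
        # 2. Expansion: add one child for an untried move.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.state.play(move), parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        result = rollout(node.state)
        # 4. Backpropagation: update statistics up to the root
        # (sign flips for the two-player case omitted for brevity).
        while node is not None:
            node.visits += 1
            node.wins += result
            node = node.parent
    # Choose the most-visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move
```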

Where exactly did DeepMind use neural networks? Was it in the evaluation portion? And if so, how did they determine what makes a "good" or "bad" game state? If they deferred to the evaluations of another chess engine like Stockfish, how is it that AlphaZero absolutely demolishes Stockfish in head-to-head matches?
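For reference, here is my rough guess at how a learned evaluator could slot into the search in place of random rollouts, which is what I mean by "the evaluation portion." The `network` callable (returning move priors and a value estimate) and the node fields below are my own assumptions for illustration, not the actual architecture:

```python
import math

class Node:
    def __init__(self, state, parent=None, move=None, prior=1.0):
        self.state = state
        self.parent = parent
        self.move = move
        self.prior = prior        # prior probability for the move leading here
        self.children = []
        self.visits = 0
        self.value_sum = 0.0

def puct_select(node, c_puct=1.5):
    # PUCT-style selection: average value so far, plus an exploration bonus
    # weighted by the prior and shrinking as the child accumulates visits.
    return max(
        node.children,
        key=lambda ch: (ch.value_sum / ch.visits if ch.visits else 0.0)
        + c_puct * ch.prior * math.sqrt(node.visits) / (1 + ch.visits),
    )

def expand_and_evaluate(node, network):
    # `network` is a hypothetical callable: state -> ({move: prior}, value),
    # i.e. the learned evaluation replaces the random playout entirely.
    priors, value = network(node.state)
    for move, p in priors.items():
        node.children.append(
            Node(node.state.play(move), parent=node, move=move, prior=p)
        )
    return value

def backpropagate(node, value):
    # Push the value estimate back up the tree (again omitting the sign
    # flips a two-player implementation would apply at each ply).
    while node is not None:
        node.visits += 1
        node.value_sum += value
        node = node.parent
```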
