
In the AlphaZero paper (https://arxiv.org/pdf/1712.01815.pdf), page 13, the input to the NN is described. At the beginning of the page, the authors state that:

"The input to the Neural Network is an N x N x (MT + L) image stack [...]"

From this, I understand that (for one training example) each input feature is an 8x8 plane. (Technically speaking, every value of every 8x8 plane is a feature, but for the purpose of the question let's suppose that a plane is an input feature).
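
To make the shape concrete, here is a small sketch of how I read it for chess. The values of M and L are my reading of the table on that page (they are not part of the quote above), so treat them as assumptions:

```python
# Sketch of the N x N x (MT + L) input shape for chess.
# M and L are taken from my reading of the paper's input-feature table.
N = 8   # the board is N x N
T = 8   # number of history steps
M = 14  # planes per history step: 6 P1 piece + 6 P2 piece + 2 repetitions
L = 7   # constant planes: colour, total move count, 2 + 2 castling, no-progress count

num_planes = M * T + L
input_shape = (N, N, num_planes)
print(input_shape)
```

So one training example would be a stack of 119 planes, each 8x8.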

In the description of the table on top of the image, the following statement is made:

"[...] Counts are represented by a single real-valued input; other input features are represented by a one-hot encoding using the specified number of binary input planes. [...]"

I understand how they convert the P1 and P2 pieces to one-hot encodings. My questions are:

  1. When they say single real-valued input, since every input feature should be an 8x8 plane, do they mean that they create an 8x8 plane where every entry has the same real value? For example, for the 'Total move count' plane, if 10 moves had been played in the game so far, would it look like the one below?
  move_count_plane = [[10, 10, 10, 10, 10, 10, 10, 10],
                      [10, 10, 10, 10, 10, 10, 10, 10],
                      [10, 10, 10, 10, 10, 10, 10, 10],
                      [10, 10, 10, 10, 10, 10, 10, 10],
                      [10, 10, 10, 10, 10, 10, 10, 10],
                      [10, 10, 10, 10, 10, 10, 10, 10],
                      [10, 10, 10, 10, 10, 10, 10, 10],
                      [10, 10, 10, 10, 10, 10, 10, 10]]
  2. For the 'Repetitions' plane, is it the same case as above? Do they mean a plane where every value is the number of times a specific board position has been reached? For example, if a specific position has been reached 2 times, would the repetitions plane for that position be the one below?
  # for a specific timestep in the T=8 step history
  repetitions_plane = [[2, 2, 2, 2, 2, 2, 2, 2],
                       [2, 2, 2, 2, 2, 2, 2, 2],
                       [2, 2, 2, 2, 2, 2, 2, 2],
                       [2, 2, 2, 2, 2, 2, 2, 2],
                       [2, 2, 2, 2, 2, 2, 2, 2],
                       [2, 2, 2, 2, 2, 2, 2, 2],
                       [2, 2, 2, 2, 2, 2, 2, 2],
                       [2, 2, 2, 2, 2, 2, 2, 2]]

Also, why do they keep 2 repetition planes? Is it one for every player? (8 repetition planes for the past T=8 moves for P1, and 8 more repetition planes for the past T=8 moves for P2?)

Thanks in advance.

Andrew

1 Answer


For anyone wondering, I believe I have found the answer:

  1. Yes, it will be an 8x8 plane where all the entries are the same: the number of moves (or the number of moves with no progress).

  2. There are two repetitions planes (for each position from the most recent T=8 positions):

    a) The first repetition plane will be a plane where all the entries are 1 if the position is being repeated for the first time, and 0 otherwise.

    b) The second repetition plane will be a plane where all the entries are 1 if the position is being repeated for the second time, and 0 otherwise.
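
To illustrate both points, here is a minimal sketch in plain Python. It follows my reading above, not code from the paper; the helper names `constant_plane` and `repetition_planes`, and the `times_repeated` counter, are made up for this example.

```python
N = 8  # board size for chess


def constant_plane(value, n=N):
    """Return an n x n plane (a list of rows) with every entry set to value."""
    return [[value] * n for _ in range(n)]


def repetition_planes(times_repeated, n=N):
    """Return the two binary repetition planes for one history step.

    times_repeated is a hypothetical counter: 0 if the position has not
    occurred before, 1 if it is being repeated for the first time,
    2 if it is being repeated for the second time.
    """
    first = constant_plane(1 if times_repeated == 1 else 0, n)
    second = constant_plane(1 if times_repeated == 2 else 0, n)
    return first, second


# 'Total move count' plane after 10 moves: every entry is 10.
move_count_plane = constant_plane(10)

# A position that is being repeated for the first time.
first_plane, second_plane = repetition_planes(1)
```

With this encoding the repetition count is one-hot across the two planes, rather than a single plane filled with the count as in the question.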

AndrewSpan