In the AlphaZero paper (https://arxiv.org/pdf/1712.01815.pdf), page 13, the input for the NN is described. At the beginning of the page, the authors state that:
"The input to the Neural Network is an N x N x (MT + L) image stack [...]"
From this, I understand that (for one training example) each input feature is an 8x8 plane. (Technically speaking, every value of every 8x8 plane is a feature, but for the purpose of the question let's suppose that a plane is an input feature).
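To make the dimensions concrete, here is a minimal NumPy sketch of how I picture the stack for chess. The values M = 14, T = 8 and L = 7 are my reading of the table in the paper, and the variable names are my own, so treat this as an assumption rather than the authors' implementation.

```python
import numpy as np

N = 8   # board size for chess
M = 14  # planes per history step: 6 P1 piece + 6 P2 piece + 2 repetition planes (my reading of the table)
T = 8   # number of history steps
L = 7   # constant planes: colour, total move count, 2 + 2 castling, no-progress count (my reading)

# One training example: an N x N x (M*T + L) image stack, initialised to zeros.
input_stack = np.zeros((N, N, M * T + L), dtype=np.float32)
num_planes = input_stack.shape[-1]
```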
In the description of the table on top of the image, the following statement is made:
"[...] Counts are represented by a single real-valued input; other input features are represented by a one-hot encoding using the specified number of binary input planes. [...]"
I understand how they convert the P1 and P2 pieces to one-hot encodings. My questions are:
- When they say "single real-valued input", since every input feature should be an 8x8 plane, do they mean that they create an 8x8 plane where every entry holds the same real value? For example, for the 'Total move count' plane, if 10 moves had been played in the game so far, would it look like the one below?
move_count_plane = [[10, 10, 10, 10, 10, 10, 10, 10],
[10, 10, 10, 10, 10, 10, 10, 10],
[10, 10, 10, 10, 10, 10, 10, 10],
[10, 10, 10, 10, 10, 10, 10, 10],
[10, 10, 10, 10, 10, 10, 10, 10],
[10, 10, 10, 10, 10, 10, 10, 10],
[10, 10, 10, 10, 10, 10, 10, 10],
[10, 10, 10, 10, 10, 10, 10, 10]]
- For the 'Repetitions' plane, is it the same case as above? Do they mean a plane where every value is the number of times a specific board setup has been reached? For example, if a specific position has been reached 2 times, would the repetitions plane for that position look like the one below?
# for a specific timestep in the T=8 step history
repetitions_plane = [[2, 2, 2, 2, 2, 2, 2, 2],
[2, 2, 2, 2, 2, 2, 2, 2],
[2, 2, 2, 2, 2, 2, 2, 2],
[2, 2, 2, 2, 2, 2, 2, 2],
[2, 2, 2, 2, 2, 2, 2, 2],
[2, 2, 2, 2, 2, 2, 2, 2],
[2, 2, 2, 2, 2, 2, 2, 2],
[2, 2, 2, 2, 2, 2, 2, 2]]
Also, why do they keep 2 repetition planes? Is it one for every player (8 repetition planes for the past T=8 moves for P1, and 8 more repetition planes for the past T=8 moves for P2)?
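In code, the interpretation I am asking about for both count features would be something like the NumPy sketch below. The helper name `constant_plane` is hypothetical (my own), and whether AlphaZero actually broadcasts the scalar across the board this way is exactly what I am unsure about.

```python
import numpy as np

def constant_plane(value, board_size=8):
    """Hypothetical helper: a board_size x board_size plane filled with one real value."""
    return np.full((board_size, board_size), value, dtype=np.float32)

# My guess for the 'Total move count' plane after 10 moves.
move_count_plane = constant_plane(10)

# My guess for a 'Repetitions' plane when the position has occurred 2 times.
repetitions_plane = constant_plane(2)
```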
Thanks in advance.