I was told that AlphaGo (or some related program) was not explicitly taught even the rules of Go -- if it was "just given the rulebook", what does this mean? Literally, a book written in English to read?
-
Most likely, they refer to an encoding of rules of Go. – SpiderRico Nov 06 '20 at 19:47
-
okay, how is this encoding done with a neural network? – releseabe Nov 06 '20 at 19:50
-
I don't know :( – SpiderRico Nov 06 '20 at 19:58
1 Answer
it was "just given the rulebook", what does this mean? Literally a book written in English to read?
The program was not given a natural language version of the rules to interpret. That might be an interesting AI challenge in its own way, but none of the current cutting-edge game playing reinforcement learning systems do much in the way of natural language processing.
Instead, "just given the rulebook" is a rough metaphor for what actually happened: The rules of Go were implemented as functions that the game-playing agent could query. The functions can answer things such as "when the board looks like this, what are my valid actions?" and "if the board looks like this, have I won yet?". The board state might be represented by a matrix with stone positions encoded using numbers. Outputs might be a similar matrix of numbers for valid action choices (where the player is allowed to put stones in Go) or perhaps a single number, $0$ for not won yet, $1$ for a move that wins the game.
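A minimal sketch of what such a "rulebook as functions" interface could look like. This is purely illustrative, not AlphaGo's actual code: the board is a matrix of integers, `legal_moves` answers "what are my valid actions?" as a matrix of 0/1 flags, and `is_win` answers "have I won yet?". Real Go legality (captures, ko, suicide) and real scoring (territory) are deliberately omitted here.

```python
# Toy "rulebook as functions" interface for a Go-like game.
# Hypothetical simplification: any empty point is a legal move,
# and the "win" check just compares stone counts.

EMPTY, BLACK, WHITE = 0, 1, 2

def new_board(size=9):
    """A board state is just a size x size matrix of integers."""
    return [[EMPTY] * size for _ in range(size)]

def legal_moves(board, player):
    """Matrix of 0/1 flags marking where the player may place a stone."""
    return [[1 if cell == EMPTY else 0 for cell in row] for row in board]

def is_win(board, player):
    """Returns 1 if the player currently occupies more points, else 0.
    (Real Go scoring counts territory; this is a stand-in.)"""
    opponent = BLACK if player == WHITE else WHITE
    mine = sum(row.count(player) for row in board)
    theirs = sum(row.count(opponent) for row in board)
    return 1 if mine > theirs else 0
```

The agent never sees prose rules; it only calls these functions and observes the numeric answers.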
There may even be further helper functions that help assess moves (e.g. a value for how many enemy stones would be captured if a piece was played in a specific location), but the bare minimum needed is "what moves are valid?" and "has anyone won?". A very common third function, useful for look-ahead planning is "if the board starts like this, and I take that action, what will the board look like next" - with this function, an agent can look ahead to future positions to help search for winning moves.
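The third function described above, the transition function used for look-ahead, can be sketched in the same toy style. The names and the board encoding are assumptions carried over from nothing in particular (any encoding would do); capture logic is again omitted:

```python
import copy

EMPTY, BLACK, WHITE = 0, 1, 2

def next_board(board, player, row, col):
    """'If the board starts like this and I take that action, what will
    the board look like next?' -- returns a new board, leaving the
    original untouched (capture resolution omitted in this sketch)."""
    nb = copy.deepcopy(board)
    nb[row][col] = player
    return nb

def one_ply_lookahead(board, player, legal_moves):
    """Enumerate every position reachable in one move, the basic
    building block of look-ahead search."""
    return [next_board(board, player, r, c)
            for r, flags in enumerate(legal_moves(board, player))
            for c, ok in enumerate(flags) if ok]
```

Chaining `next_board` calls lets a planner examine positions several moves deep without ever mutating the real game state.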
This approach is common in game playing agents. A learning agent can in theory also learn the rules of the game through trial and error, as long as it receives some feedback when it has broken the rules. However, more often the goal of training the agent is only to play as well as possible - learning the rules from scratch would be extra work for the agent, and maybe not an interesting problem to solve. So the agent is given helper functions by the developers that allow it to explore only valid moves according to the rules.

-
When you say that a human gives it this information, it seems like AlphaGo was partially programmed while the actual strategy was derived completely by the neural net (or whatever AlphaGo is). It would be cool if natural language processing were indeed the way it could learn any game, by literally just reading a book, and that frankly seems like an easier problem than becoming the uber-world champ in Go. – releseabe Nov 07 '20 at 01:34
-
@releseabe Natural language processing to the level of understanding documentation for a game that was intended for humans is in fact a far, far harder problem than becoming a world champion Go player. Cutting-edge research into NLP is not even close, and researchers don't even understand the *problem* fully, let alone have an approach that could solve it. This is slightly counter-intuitive, since humans seem to take little effort to read a book of rules just fine, whilst becoming a Go master is very hard for a person. Difficulty of problems in AI is different from human experience. – Neil Slater Nov 07 '20 at 12:41