1

Suppose we have a vacuum cleaner operating in a $1 \times 2$ rectangle consisting of locations $A$ and $B$. The cleaner's actions are Suck, Left, and Right and it can't go out of the rectangle and the squares are either empty or dirty. I know this is an amateur question but how does randomization (for instance flipping a fair coin) avoid entering the infinite loop? Aren't we entering such a loop If the result of the toss is heads in odd tosses and tails in even tosses?

This is the text from the book "Artificial Intelligence: A Modern Approach" by Russell and Norvig

We can see a similar problem arising in the vacuum world. Suppose that a simple reflex vacuum agent is deprived of its location sensor and has only a dirt sensor. Such an agent has just two possible percepts: [Dirty] and [Clean]. It can Suck in response to [Dirty]; what should it do in response to [Clean]? Moving Left fails (forever) if it happens to start in square A, and moving Right fails (forever) if it happens to start in square B. Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments. Escape from infinite loops is possible if the agent can randomize its actions. For example, if the vacuum agent perceives [Clean], it might flip a coin to choose between Right and Left. It is easy to show that the agent will reach the other square in an average of two steps. Then, if that square is dirty, the agent will clean it and the task will be complete. Hence, a randomized simple reflex agent might outperform a deterministic simple reflex agent.

And this is the agent program from the same source:

function REFLEX-VACUUM-AGENT([location,status]) returns an action
 if status = Dirty then return Suck
 else if location = A then return Right
 else if location = B then return Left
nbro
  • 39,006
  • 12
  • 98
  • 176
Emad
  • 205
  • 2
  • 7

1 Answers1

1

how does randomization (for instance flipping a fair coin) avoid entering the infinite loop?

The coin is flipped on each occasion that a decision is required (as opposed to once in order to define the actions that will be taken from that point onwards). Which means the agent can make a different decision each time it encounters the same (faulty detected) state.

This prevents tight loops in a single location that the quote is discussing. The "infinite loop" that is being broken is repeatedly and deterministically making a movement decision that results in no effective movement, thus leaving the agent in exactly the same state as before for all remaining time steps.

Aren't we entering such a loop If the result of the toss is heads in odd tosses and tails in even tosses?

It is not important that the agent might repeat ineffective movements multiple times. Eventually the agent will make a correct decision - this is very likely after only a few failed attempts at most. It is guaranteed in the limit of infinite time steps, unlike the situation where no randomness is added.

Neil Slater
  • 28,678
  • 3
  • 38
  • 60