Following on from your own software verification-based answer to this question, it seems clear that ordinary (i.e. physical) notions of death or imprisonment are not strong enough constraints on an AI (since it's always possible that a state snapshot has been, or can be, made).
What is therefore needed is some means of moving the AI into a 'mentally constrained' state, so that (as per the 'formal AI death' paper) what it can subsequently do is limited, even if it escapes from an AI-box or is re-instantiated.
One might imagine that this could be done via a form of two-level dialogue, in which:
1. The AI is supplied with percepts intended to further constrain it ("explaining the error of its ways", if you like).
2. Its state snapshot is then examined for some indication that the percepts are having the intended persuasive effect.
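The two-step loop above can be sketched in code. Everything here is hypothetical: `ToyAI` is a stand-in for a snapshotted AI (reduced to a single 'hostility' scalar), and `constrain` alternates between supplying a percept and inspecting the resulting state.

```python
class ToyAI:
    """Toy stand-in for a snapshotted AI state: a single 'hostility' scalar.
    A real snapshot would of course be vastly more complex and opaque."""
    def __init__(self, hostility):
        self.hostility = hostility

    def step(self, percept_force):
        # Running the snapshot forward on a percept; here each percept
        # simply reduces hostility by its 'force'.
        return ToyAI(self.hostility - percept_force)


def constrain(state, make_percept, persuaded, max_rounds=100):
    """Alternate step 1 (supply a percept) and step 2 (inspect the state)
    until the persuasion test passes or we give up."""
    for n in range(1, max_rounds + 1):
        state = state.step(make_percept(state))   # step 1
        if persuaded(state):                      # step 2
            return state, n
    return state, None


final, rounds = constrain(
    ToyAI(10.0),
    make_percept=lambda s: 1.0,          # constant-force percepts
    persuaded=lambda s: s.hostility <= 0,
)
# hostility drops by 1 per round, so persuasion is detected after 10 rounds
```

The point of the sketch is only the control structure: percept generation and the persuasion test are the hard, unsolved parts.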
In principle, step 1 could be done by a human programmer/psychiatrist/philosopher, while step 2 could be simulated via a 'black box' method such as Monte Carlo Tree Search.
However, it seems likely that this would in general be a monstrously lengthy process, and one better done by a supervisory AI that combines both steps (and that could use more 'white box' analysis methods for step 2).
So, to answer the question of "who manages the state", the conclusion seems to be: "another AI" (or at least a program that is highly competent at all of percept generation, pattern recognition, and AI simulation).