What do the terms 'Bellman backup' and 'Bellman error' mean?

Question

Some RL literature uses terms such as 'Bellman backup' and 'Bellman error'. What do these terms refer to?

There's already an answer that addresses both concerns/questions, but, please, next time, focus on one question per post, although, in this case, the terms are highly related (but I still think these "simple" questions could have been asked in separate posts). It may also be a good idea to provide more context (e.g. a link to an article that mentions these terms), although, again, in this case, anyone familiar with RL would be able to understand the question. — nbro, Jun 28 '21 at 13:15

score 3 · Accepted Answer · answered Jun 28 '21 at 12:01

A Bellman backup is an application of a Bellman operator. For example, the step

$$ V(x)\leftarrow \alpha(R + \mathbf{E}[V(x')]) + (1-\alpha)V(x) $$

is a Bellman backup for some learning rate $\alpha$.
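To make the update concrete, here is a minimal sketch of that step on a tabular value function. The three-state chain, the transition matrix `P`, the reward vector `R`, and the function name `bellman_backup` are all hypothetical and chosen only for illustration; the expectation over $x'$ is computed exactly from `P`, which assumes the model is known.

```python
# Hypothetical 3-state chain: 0 -> 1 -> 2, where state 2 is terminal.
# P[x][y] is the probability of moving from state x to state y.
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
R = [0.0, 1.0, 0.0]  # illustrative reward received on leaving each state


def bellman_backup(V, x, alpha):
    """Return the new value of state x after one backup.

    Implements V(x) <- alpha * (R + E[V(x')]) + (1 - alpha) * V(x),
    with the expectation taken over next states x' under P.
    """
    expected_next = sum(p * v for p, v in zip(P[x], V))
    target = R[x] + expected_next
    return alpha * target + (1 - alpha) * V[x]


V = [0.0, 0.0, 0.0]
for sweep in range(2):
    for x in range(3):
        V[x] = bellman_backup(V, x, alpha=1.0)
    print(sweep, V)
```

In this sketch the reward sits at state 1, so the first sweep only changes `V[1]`; the second sweep carries that value one step further back to `V[0]`. With `alpha=1.0` the old estimate is overwritten by the target; a smaller `alpha` blends the two.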

A Bellman error is

$$ d(V(x), R + \mathbf{E}[V(x')]) $$

for some distance function $d$, usually the squared difference $d(a, b) = (a-b)^2$.
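As a rough illustration, the squared-difference version of this error can be computed per state as follows. The three-state chain, `P`, `R`, and the name `bellman_error` are hypothetical, and the expectation is again taken exactly under an assumed known model.

```python
# Hypothetical 3-state chain: 0 -> 1 -> 2, where state 2 is terminal.
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
R = [0.0, 1.0, 0.0]  # illustrative reward received on leaving each state


def bellman_error(V, x):
    """Squared difference between V(x) and its target R + E[V(x')]."""
    expected_next = sum(p * v for p, v in zip(P[x], V))
    target = R[x] + expected_next
    return (V[x] - target) ** 2


V_guess = [0.0, 0.0, 0.0]  # an arbitrary initial estimate
V_fixed = [1.0, 1.0, 0.0]  # satisfies V(x) = R + E[V(x')] in every state

errors_guess = [bellman_error(V_guess, x) for x in range(3)]
errors_fixed = [bellman_error(V_fixed, x) for x in range(3)]
print(errors_guess, errors_fixed)
```

The error is zero in every state exactly when the value function agrees with its own one-step target, which is why it is commonly used as a training loss or convergence check.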

harwiltz

What does 'backup' refer to here? – user529295 Jun 28 '21 at 12:55

It refers to propagating information from later states to earlier ones (backward in time sorta) – harwiltz Jun 28 '21 at 13:02

It may be a good idea to 1. also provide the [figures of the backups e.g. that you can find in Sutton and Barto's book](http://incompleteideas.net/book/first/figures/figures.html), 2. to link the OP to [this question](https://ai.stackexchange.com/q/11057/2444) about what the Bellman operator is and 3. explain the symbols in your answer. – nbro Jun 28 '21 at 13:12
@nbro Thanks for the references. – user529295 Jun 28 '21 at 13:18
