I am thinking of applying apprenticeship learning to retrospective (previously logged) data. Looking at this paper by Abbeel and Ng (https://ai.stanford.edu/~ang/papers/icml04-apprentice.pdf) on apprenticeship learning, it seems to me that at step 5 of the algorithm,
- Compute (or estimate) $\mu^{(i)} = \mu(\pi^{(i)})$, where $\mu(\pi) = E\left[\sum_{t=0}^{\infty} \gamma^{t} \phi(s_t) \,\middle|\, \pi\right]$ and $\phi(s_t)$ is the reward feature vector at state $s_t$.
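For concreteness, the paper estimates these feature expectations empirically by averaging discounted feature sums over $m$ sampled trajectories (it does this explicitly for the expert's $\mu_E$, and the same estimator applies to any policy given trajectories sampled from it). A minimal sketch of that estimator, where `phi` and `trajectories` are hypothetical placeholders and `phi` is assumed to return a NumPy array:

```python
import numpy as np

def estimate_mu(trajectories, phi, gamma=0.9):
    """Monte Carlo estimate of feature expectations:
    mu_hat = (1/m) * sum_i sum_t gamma^t * phi(s_t),
    averaged over m state sequences sampled under some policy."""
    m = len(trajectories)
    mu_hat = np.zeros_like(phi(trajectories[0][0]), dtype=float)
    for states in trajectories:
        for t, s in enumerate(states):
            mu_hat += gamma ** t * phi(s)
    return mu_hat / m
```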
From my understanding, a trajectory $s_0, s_1, s_2, \ldots$ would have to be generated at this step by actually executing the policy $\pi^{(i)}$ in the environment. But retrospective data only contains trajectories generated by whatever behaviour policy was in use when the data was logged, not by $\pi^{(i)}$. Hence, applying this algorithm to retrospective data would not work?
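To make the concern concrete, here is what estimating $\mu(\pi^{(i)})$ by sampling would look like if a simulator were available (a hypothetical `env` with a Gym-style `reset`/`step` interface); it is exactly this execution of $\pi^{(i)}$ that a fixed, retrospective dataset does not provide:

```python
def rollout(env, policy, horizon=1000):
    """Sample one trajectory s_0, s_1, s_2, ... by executing `policy`.
    This interaction is the part retrospective data cannot replace:
    the logged trajectories follow the original behaviour policy,
    not the newly computed pi^(i)."""
    states = []
    s = env.reset()
    for _ in range(horizon):
        states.append(s)
        s, _reward, done, _info = env.step(policy(s))
        if done:
            break
    return states

# With a simulator, step 5 would then be, e.g.:
# mu_i = estimate_mu([rollout(env, pi_i) for _ in range(m)], phi)
```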