
I have read about methods that apply continual learning strategies to reinforcement learning.

Since reinforcement learning also learns step by step (i.e., task by task, in a sense) during the training phase, why isn't it itself considered a continual learning strategy?

Of course, I understand that if an agent catastrophically forgets previously learned tasks, there is a need to prevent this and therefore to develop strategies that mitigate catastrophic forgetting, but my question is more about the definition. If continual learning (or online learning) is about learning one task at a time, and RL in some sense does this, why is it not considered a continual learning strategy (regardless of the fact that it may not be as effective)?

To clarify, I haven't read the claim anywhere that RL is not a CL approach, but I also haven't read that it is one. Only the fact that CL methods are proposed for RL gives me the impression that RL itself is not considered a CL approach. Nor have I seen anyone mention RL for this purpose. I'm just wondering why that is.
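To make the question concrete, here is a minimal sketch of the situation I have in mind. It is a toy example of my own, not taken from any paper: a hypothetical 7-cell corridor where task "A" rewards reaching the right end and task "B" rewards reaching the left end, learned one after the other with plain tabular Q-learning and a single shared Q-table. All names (`train`, `greedy_return`, the corridor itself) are assumptions made for illustration.

```python
import random

N_STATES = 7          # corridor cells 0..6; both end cells are terminal
START = 3             # every episode starts in the middle
ACTIONS = (-1, +1)    # action index 0 = step left, 1 = step right


def reward(task, state):
    """Task 'A' rewards the right end, task 'B' rewards the left end."""
    if state == N_STATES - 1:
        return 1.0 if task == "A" else -1.0
    if state == 0:
        return 1.0 if task == "B" else -1.0
    return 0.0


def train(q, task, rng, episodes=2000, alpha=0.5, gamma=0.9):
    """Tabular Q-learning; behaviour policy is uniformly random (off-policy)."""
    for _ in range(episodes):
        s = START
        while 0 < s < N_STATES - 1:
            a = rng.randrange(2)
            s2 = s + ACTIONS[a]
            done = s2 in (0, N_STATES - 1)
            # Bootstrapped target: no future value once a terminal cell is hit.
            target = reward(task, s2) + (0.0 if done else gamma * max(q[s2]))
            q[s][a] += alpha * (target - q[s][a])
            s = s2


def greedy_return(q, task, max_steps=50):
    """Follow the greedy policy from START and return the terminal reward."""
    s = START
    for _ in range(max_steps):
        a = 0 if q[s][0] >= q[s][1] else 1
        s += ACTIONS[a]
        if s in (0, N_STATES - 1):
            return reward(task, s)
    return 0.0  # never reached a terminal cell


rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]  # one shared table for both tasks

train(q, "A", rng)
a_before = greedy_return(q, "A")   # performance on A right after learning A

train(q, "B", rng)                 # keep learning, now on task B
a_after = greedy_return(q, "A")    # performance on A after learning B
b_after = greedy_return(q, "B")

print(a_before, a_after, b_after)
```

In this sketch the agent does keep learning from new data (it solves B after the second phase), but the greedy policy no longer solves A, because the same table entries were overwritten. So the agent learns "task by task", yet nothing in the algorithm preserves the earlier task, which is exactly the gap my question is about.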

convaldo
  • To support your claims that RL is not a continual learning approach, you should probably cite 2-3 research papers that claim that. – nbro Apr 29 '21 at 10:23
  • I didn't see any papers stating specifically that it is not, but since other methods are proposed on top of RL, it seems that it isn't considered CL. But that might just be my impression, and perhaps it is wrong. That's why I'm asking. – convaldo Apr 29 '21 at 11:06
  • The reason why I am asking this is because I know people that have done research on RL by considering it a continual learning technique, which is a reasonable thing to do. In any case, I know, as you also know, that people have developed approaches to deal with catastrophic forgetting in RL too, and that's probably why you're asking this question. So, I suggest that you at least cite 1-2 papers that do this and that made you ask this question. – nbro Apr 29 '21 at 11:29
  • Anyway, I think the answer to your question is that it depends on your definition of "continual learning". If, by continual learning, you mean "continually learning without catastrophic forgetting", then RL is generally not a continual learning technique, because we are aware of the fact that certain RL algorithms suffer from CF. However, if you just mean "the potential ability to continually learn from more and more data", then RL could be considered a continual learning technique. – nbro Apr 29 '21 at 11:31
  • Ah okay! I'm glad you support my assumption that it depends on the definition. Can you give me some examples of papers in which RL was used as a CL technique? – convaldo Apr 29 '21 at 12:31

0 Answers