How should I design a reward function for an NLP problem where two models interoperate?

Asked Apr 16 '20 at 12:29

Active Apr 16 '20 at 14:12

Viewed 75 times

I would like to design a reward function. I am training two models. The first model classifies a set of texts (paragraphs and keywords), and I also get some hidden states from it. The second model tries to generate keywords for those paragraphs.

I want to use the hidden states from the first model to assign rewards to the key phrases generated by the second model. How can I implement this reward function? I have never used one before.
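To make the setup concrete, here is a minimal sketch of one possible way to frame such a reward. It is not a method confirmed anywhere in this thread. It assumes that a generated key phrase can be passed back through the first model to obtain a hidden state in the same space as the paragraph's hidden state, that cosine similarity between the two is a reasonable reward signal, and that the generator is trained with a REINFORCE-style policy gradient. All function names and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat similarity with a zero vector as 0
    return dot / (norm_u * norm_v)


def keyphrase_reward(paragraph_state, keyphrase_state):
    """Hypothetical reward: closeness of the generated key phrase to the
    paragraph in the classifier's hidden space (assumed to be shared)."""
    return cosine_similarity(paragraph_state, keyphrase_state)


def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE loss for one sampled key phrase:
    -(reward - baseline) * sum of the log-probabilities of its tokens."""
    return -(reward - baseline) * sum(token_log_probs)


# Toy hidden states standing in for the first model's outputs (assumed values).
paragraph_state = [1.0, 0.0, 1.0]
good_keyphrase_state = [2.0, 0.0, 2.0]  # same direction as the paragraph
bad_keyphrase_state = [0.0, 1.0, 0.0]   # orthogonal to the paragraph

good_reward = keyphrase_reward(paragraph_state, good_keyphrase_state)
bad_reward = keyphrase_reward(paragraph_state, bad_keyphrase_state)

# Log-probabilities the generator assigned to the sampled tokens (assumed values).
loss = reinforce_loss([-0.5, -1.5], good_reward, baseline=0.5)
```

In a real setup the hidden states would come from the trained classifier and the log-probabilities from the generator's decoder. Subtracting a baseline, such as the running mean reward, is a common way to reduce the variance of the gradient estimate.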

edited Apr 16 '20 at 14:12

nbro

asked Apr 16 '20 at 12:29

No Na

The question sounds a bit confusing to me. Can you reframe it and explain in more detail which models you're using and where RL comes into play? I don't get what kind of policy you want to train. From what I understood, you have one model that classifies some text into paragraphs and another that extracts keywords for each paragraph. Is that correct? – Edoardo Guerriero Apr 16 '20 at 13:39

0 Answers