I would like to design a reward function. I am training two models from the first model that classify set of texts (paragraphs and keywords) and I also got some hidden states. The second model is trying to generate keywords for those paragraphs.
I want to use those hidden states from the first model to give rewards for key phrases that are generated from the second model. I want to know how can I implement this reward function since I have never used it before.