Questions tagged [rlhf]

For questions related to Reinforcement Learning from Human Feedback (RLHF).

4 questions
2 votes, 1 answer

Why do we need RL in RLHF?

In RLHF, the reward function is a neural network. This means we can compute its gradients cheaply and accurately through backpropagation. Now, we want to find a policy that maximizes reward (see https://arxiv.org/abs/2203.02155). Then, why do we…
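A minimal PyTorch sketch of the issue behind this question (illustrative, not the InstructGPT implementation): the reward model itself is differentiable, but the tokens it scores are produced by sampling from the policy, which breaks the gradient path; policy-gradient methods such as PPO estimate that gradient instead.

```python
import torch

logits = torch.randn(1, 5, requires_grad=True)  # policy logits over a toy 5-token vocabulary
probs = torch.softmax(logits, dim=-1)

# Sampling is discrete: the resulting token carries no gradient, so a
# reward computed on it cannot be backpropagated into the logits.
token = torch.multinomial(probs, num_samples=1)
assert not token.requires_grad

# Policy gradient (REINFORCE) sidesteps this: weight log-probs by reward,
# so E[R * grad log pi] estimates the gradient of expected reward.
reward = 1.0                                    # stand-in for a reward-model score
logprob = torch.log(probs[0, token.item()])
loss = -reward * logprob
loss.backward()                                 # gradients now reach the logits
print(logits.grad)
```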
0 votes, 0 answers

Negative KL-divergence RLHF implementation

I am struggling to understand one part of the FAQ of the Transformer Reinforcement Learning (TRL) library from Hugging Face: "What Is the Concern with Negative KL Divergence?" If you generate text by purely sampling from the model distribution, things work…
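For context on this question, a minimal sketch (an assumed setup, not TRL's actual code) of why the sampled KL penalty can go negative: the per-token estimator log pi(x) - log pi_ref(x) is negative whenever the reference assigns the sampled token more probability than the policy does, even though the true KL is non-negative in expectation.

```python
import torch

pi  = torch.tensor([0.1, 0.9])   # policy distribution over two tokens
ref = torch.tensor([0.6, 0.4])   # reference (pre-RLHF) distribution

token = 0  # e.g. a token forced by decoding constraints rather than sampled from pi
per_token_kl = torch.log(pi[token]) - torch.log(ref[token])
print(per_token_kl)  # ~ -1.79: a negative per-sample estimate

# Averaged under pi, the estimator recovers the true, non-negative KL:
true_kl = (pi * (pi / ref).log()).sum()
print(true_kl)       # ~ 0.55
```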
0 votes, 0 answers

Can pretraining be continued after RLHF?

Assume you have a pretrained transformer language model (M1) which has already undergone reinforcement learning from human feedback (M2). I assume that it is in principle possible to continue the pretraining after RLHF with some additional documents, e.g.…
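A hedged sketch of what continuing pretraining would look like mechanically (the checkpoint name and documents below are hypothetical): an RLHF'd model is still an ordinary causal LM, so the standard next-token objective runs on it; whether this degrades the learned alignment is the open question in the post.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

ckpt = "your-org/model-after-rlhf"  # hypothetical RLHF'd checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # padding is needed by the collator

docs = Dataset.from_dict({"text": ["Some additional document ...", "Another one ..."]})
tokenized = docs.map(lambda b: tokenizer(b["text"], truncation=True),
                     batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="cpt-after-rlhf", num_train_epochs=1),
    train_dataset=tokenized,
    # mlm=False gives the standard causal-LM (next-token) loss
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```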
0 votes, 1 answer

What is the difference between fine-tuning and RLHF for LLMs?

I am confused about the difference between fine-tuning and RLHF for LLMs. When to use what? I know RLHF requires creating a reward model which first rates responses to align them with human preferences and afterward uses this reward…
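A minimal sketch contrasting the two objectives asked about here (illustrative shapes only, not a real model): supervised fine-tuning imitates demonstration tokens via cross-entropy, while RLHF first fits a reward model on human preference pairs with the Bradley-Terry loss and then optimizes the policy against it with RL such as PPO (not shown).

```python
import torch
import torch.nn.functional as F

# Supervised fine-tuning: cross-entropy against demonstration tokens.
logits  = torch.randn(4, 100)          # (sequence positions, toy vocabulary)
targets = torch.randint(0, 100, (4,))  # tokens from a human demonstration
sft_loss = F.cross_entropy(logits, targets)

# RLHF step 1: reward model trained on preference pairs with
#   -log sigmoid(r_chosen - r_rejected)
r_chosen   = torch.tensor(1.3)  # reward-model score for the preferred response
r_rejected = torch.tensor(0.2)  # score for the dispreferred response
rm_loss = -F.logsigmoid(r_chosen - r_rejected)

print(sft_loss, rm_loss)  # RLHF step 2 (PPO against the reward model) not shown
```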