What's the difference between GPT3.5 and InstructGPT?

Question

I read about the different model series in GPT3.5 here: https://platform.openai.com/docs/models/gpt-3-5

At the beginning of the page, it says to look at https://platform.openai.com/docs/model-index-for-researchers to understand the difference between the InstructGPT and GPT3.5 model series.

But, on that page, it says InstructGPT is a part of the GPT3.5 series. So, what's the difference between GPT3.5 and InstructGPT?

score 1 · Answer 1 · answered Apr 07 '23 at 22:38


InstructGPT models are fine-tuned GPT3.5 models. They have been tuned with different techniques to be good at following instructions and chatting. Examples of these fine-tuning techniques are (1) supervised fine-tuning (SFT) on human demonstrations and (2) RLHF, which uses PPO to optimise the model against a reward model trained on human comparisons.
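To make the "reward model trained from comparisons" part more concrete, here is a minimal sketch of the pairwise preference loss that is commonly used for that step (a Bradley-Terry style objective). The function name and the plain-float inputs are my own illustrative assumptions; this is not OpenAI's code, and a real reward model would produce these scores from a neural network.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Loss for one human comparison: -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the response the human preferred gets the higher
    score, and large when the reward model ranks the pair the wrong way round.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); split on the sign of m so that
    # exp() is never called on a large positive number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores for two responses to the same prompt.
agree = pairwise_preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagree = pairwise_preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agree < disagree)
```

Roughly speaking, the reward model is trained to minimise this loss over many human comparisons, and PPO then updates the language model so that its outputs score highly under that reward model.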


Rexcirus


Hi! Thanks for your answer, but it leaves me with the same question. If you see the first answer in this discussion: https://ai.stackexchange.com/questions/39023/are-gpt-3-5-series-models-based-on-gpt-3?newreg=d30affa2b8494f3faefd53868a57104e, the difference between InstructGPT and ChatGPT is confusing. Specifically, if the InstructGPT models are trained with SFT and PPO, what does code-davinci-002 indicate? – Arya Apr 10 '23 at 07:01
See here https://help.openai.com/en/articles/6195637-getting-started-with-codex – Rexcirus Apr 10 '23 at 17:23
