I read about the different model series in GPT-3.5 here: https://platform.openai.com/docs/models/gpt-3-5

At the beginning of the page, it says to look at https://platform.openai.com/docs/model-index-for-researchers to understand the difference between the InstructGPT and GPT-3.5 model series.

But that page says InstructGPT is part of the GPT-3.5 series. So what is the difference between GPT-3.5 and InstructGPT?

nbro
Arya

1 Answer

InstructGPT models are fine-tuned GPT-3.5 models. They have been tuned with different techniques to be good at following instructions and chatting. Examples of fine-tuning techniques are (1) supervised fine-tuning on human demonstrations and (2) RLHF using PPO to maximise the score given by a reward model trained on human comparisons.
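
To make "a reward model trained from comparisons by humans" a bit more concrete, here is a minimal sketch of the pairwise comparison loss commonly used for such reward models: the negative log-sigmoid of the score difference between the preferred and the rejected response. The function name `pairwise_reward_loss` and the scalar inputs are illustrative assumptions, not OpenAI's actual code; a real reward model would produce these scores with a neural network over full responses.

```python
import math


def pairwise_reward_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_preferred - reward_rejected)).

    Hypothetical helper for illustration only. The loss is small when the
    reward model scores the human-preferred response higher than the
    rejected one, and large when it ranks them the wrong way round.
    """
    diff = reward_preferred - reward_rejected
    # -log(sigmoid(d)) equals log(1 + exp(-d)); branch on the sign of d
    # so that exp() never receives a large positive argument.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# A labeller preferred response A over response B.
loss_agrees = pairwise_reward_loss(reward_preferred=2.0, reward_rejected=0.0)
loss_disagrees = pairwise_reward_loss(reward_preferred=0.0, reward_rejected=2.0)
print(loss_agrees < loss_disagrees)
```

Minimising this loss over many human comparisons pushes the reward model to agree with the labellers; PPO then updates the language model to produce responses that this reward model scores highly.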

Rexcirus
  • Hi! Thanks for your answer, but it leaves me with the same question. Given the first answer in this discussion: https://ai.stackexchange.com/questions/39023/are-gpt-3-5-series-models-based-on-gpt-3?newreg=d30affa2b8494f3faefd53868a57104e, the difference between InstructGPT and ChatGPT is still confusing. Specifically, if InstructGPT models are trained with SFT and PPO, what does code-davinci-002 indicate? – Arya Apr 10 '23 at 07:01
  • See here https://help.openai.com/en/articles/6195637-getting-started-with-codex – Rexcirus Apr 10 '23 at 17:23