Questions tagged [open-ai]

For questions related to OpenAI, including the Gym toolkit.

OpenAI is a non-profit artificial intelligence (AI) research company that aims to promote and develop friendly AI in such a way as to benefit humanity as a whole. The organization aims to "freely collaborate" with other institutions and researchers by making its patents and research open to the public. The founders (notably Elon Musk and Sam Altman) are motivated in part by concerns about existential risk from artificial general intelligence.
SOURCE: OpenAI (Wikipedia)

https://openai.com/

https://gym.openai.com/

104 questions
26 votes, 1 answer

What exactly are the "parameters" in GPT-3's 175 billion parameters and how are they chosen/generated?

When I studied neural networks, the parameters were the learning rate, batch size, etc. But even GPT-3's arXiv paper does not say exactly what the parameters are; it only gives a small hint that they might just be sentences. Even tutorial…
Nav
9 votes, 1 answer

How do I use GPT-2 to summarise text?

Section 3.6 of the OpenAI GPT-2 paper mentions summarising text, but the method is described in very high-level terms: To induce summarization behavior we add the text TL;DR: after the article and generate 100 tokens…
Tom Hale
8 votes, 2 answers

Is GPT-4 based on GPT-3 or was it trained from scratch?

To me it looks like GPT-4 is based on GPT-3. On the other hand, there were rumors that GPT-3 was trained with errors, but that retraining was impossible because of the cost.
Anixx
8 votes, 2 answers

Are GPT-3.5 series models based on GPT-3?

In the official blog post about ChatGPT from OpenAI, there is this paragraph explaining how the ChatGPT model was trained: We trained this model using Reinforcement Learning from Human Feedback (RLHF), using the same methods as InstructGPT, but with…
iMad
7 votes, 4 answers

OpenAI ChatGPT gives a network error on long responses. How can I fix it?

When OpenAI's ChatGPT replies with a very long answer, it returns a network error. In the network console, the POST request fails with an ERR_HTTP2_PROTOCOL_ERROR. The "crash" occurs after about 1 minute.
Lars Flieger
7 votes, 1 answer

2 Player Games in OpenAI Retro

I have been using OpenAI Retro for a while, and I wanted to experiment with two-player games. By two-player games, I mean co-op games like "Tennis-Atari2600" or even Pong, where 2 agents are present in one environment. There is a parameter for…
6 votes, 2 answers

My Deep Q-Learning Network does not learn for OpenAI gym's cartpole problem

I am implementing OpenAI gym's cartpole problem using Deep Q-Learning (DQN). I followed tutorials (video and otherwise) and learned all about it. I wrote my own implementation and thought it should work, but the agent is not learning. I will…
SJa
5 votes, 2 answers

Would self-hosting ChatGPT be feasible, w.r.t. computation costs?

Suppose the pre-trained ChatGPT model, as of the current date (2023-02-04), were released as open source. Would it be feasible for regular users to interact with the model on a self-hosted computer? Assumptions: I assume getting output based on some input is, at…
a.t.
5 votes, 2 answers

InstructGPT: What is the sigma in the loss function and why is $\log(\cdot)$ used?

$$ \operatorname{loss}(\theta) = -\frac{1}{\binom{K}{2}}E_{(x,y_w,y_l)\sim D}[\log(\sigma(r_{\theta}(x, y_w) - r_{\theta}(x, y_l)))] $$ The equation was taken…
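For readers skimming the listing, here is a minimal sketch of how the loss in this excerpt could be evaluated for a single prompt. It assumes that $\sigma$ is the logistic sigmoid and that the expectation over $D$ is replaced by an average over the $\binom{K}{2}$ pairs of one prompt's responses. The names `sigmoid` and `pairwise_ranking_loss` are hypothetical and not taken from any OpenAI code.

```python
import math
from itertools import combinations


def sigmoid(z: float) -> float:
    """Logistic sigmoid, assumed here to be the sigma in the loss."""
    return 1.0 / (1.0 + math.exp(-z))


def pairwise_ranking_loss(rewards_ranked: list[float]) -> float:
    """Pairwise ranking loss for one prompt (illustrative sketch).

    rewards_ranked holds the reward-model scores r_theta(x, y) for the
    K responses to a single prompt x, ordered from most preferred to
    least preferred. Every pair (y_w, y_l) with y_w ranked above y_l
    contributes log(sigmoid(r_w - r_l)).
    """
    # combinations() preserves order, so r_w always precedes r_l.
    pairs = list(combinations(rewards_ranked, 2))
    total = sum(math.log(sigmoid(r_w - r_l)) for r_w, r_l in pairs)
    # len(pairs) equals C(K, 2), the normalising factor in the formula.
    return -total / len(pairs)


# Scores that agree with the human ranking give a lower loss than
# scores that contradict it.
good = pairwise_ranking_loss([2.0, 1.0, 0.0])
bad = pairwise_ranking_loss([0.0, 1.0, 2.0])
print(good < bad)
```

Under these assumptions, the sigmoid maps a reward difference to a value in (0, 1), which can be read as the probability that the preferred response wins. Taking the negative log then makes the loss the negative log-likelihood of the observed preferences.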
5 votes, 1 answer

Why does ChatGPT create fake code?

ChatGPT has been a big thing lately. It also makes a lot of mistakes. For example, it invents functions that a package does not have and presents them as if they really work. I was wondering how that happens. Why is it creating fake functions of code and not just…
5 votes, 1 answer

How to define an action space when an agent can take multiple sub-actions in a step?

I'm attempting to design an action space in OpenAI's gym and hitting the following roadblock. I've looked at this post which is closely related but subtly different. The environment I'm writing needs to allow an agent to make between $1$ and $n$…
5 votes, 1 answer

How powerful are OpenAI's Gym and Universe for board games?

I'm a big fan of computer board games and would like to make Python chess/go/shogi/mancala programs. Having heard of reinforcement learning, I decided to look at OpenAI Gym. But first of all, I would like to know, is it possible using OpenAI…
Taissa
4 votes, 1 answer

What's the difference between GPT-3.5 and InstructGPT?

I read about the different model series in GPT-3.5 here - https://platform.openai.com/docs/models/gpt-3-5 At the beginning of the page, it says to look at https://platform.openai.com/docs/model-index-for-researchers to understand the difference…
Arya
4 votes, 0 answers

How is ChatGPT maintaining context?

It has been suggested in the answer to this earlier question that it is just remembering a certain amount of recent information. The reference used is this post by OpenAI which says that ChatGPT should only be able to maintain a context of around…
4 votes, 2 answers

How does ChatGPT respond to novel prompts and commands?

So I understand how a language model could scan a large data set like the internet and produce text that mimicked the statistical properties of the input data, e.g. completing a sentence like "eggs are healthy because ...", or producing text that…