Questions tagged [language-model]
For questions related to the concept of a language model, which is a probability distribution over sequences of words (for example, of a natural language such as English).
68 questions
46 votes · 3 answers
Was ChatGPT trained on Stack Overflow data?
Has ChatGPT used highly rated and upvoted questions/answers from Stack Overflow in its training data?
To me, it makes complete sense to take answers that have upwards of 100 upvotes and include them in your training data, but people around me seem…

Nicolas Zein
27 votes · 1 answer
What is the "temperature" in the GPT models?
What does the temperature parameter mean when talking about the GPT models?
I know that a higher temperature value means more randomness, but I want to know how randomness is introduced.
Does temperature mean we add noise to the weights/activations…

Tom Dörr
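Since this question asks how the randomness is actually introduced, here is a minimal sketch of the conventional mechanism: no noise is added to weights or activations; the logits are simply divided by the temperature before the softmax, which sharpens or flattens the sampling distribution. The logits below are toy values.

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Sample a token id after temperature-scaling the logits.

    T < 1 sharpens the distribution (more deterministic);
    T > 1 flattens it (more random). The weights are untouched.
    """
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()                        # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs)

# Toy 4-token vocabulary: low temperature almost always picks the
# argmax (token 0); high temperature spreads the choices out.
logits = [2.0, 1.0, 0.5, -1.0]
print(sample_with_temperature(logits, temperature=0.2))
print(sample_with_temperature(logits, temperature=2.0))
```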
23 votes · 4 answers
How does ChatGPT know math?
ChatGPT is a language model. As far as I know, and if I'm not wrong, it gets text as tokens and word embeddings. So how can it do math? For example, I asked:
ME: Which one is bigger 5 or 9.
ChatGPT: In this case, 9 is larger than 5.
One can say,…

Peyman
13 votes · 2 answers
Why does ChatGPT not give the answer text all at once?
When ChatGPT is generating an answer to my question, it generates it word by word.
So I actually have to wait until I get the final answer.
Is this just for show?
Or is it really generating the answer in real time, word by word, not knowing yet what the…

Sander van den Oord
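The standard mental model here is autoregressive decoding: each new token requires a forward pass conditioned on everything generated so far, so the model genuinely does not know token t+1 before token t exists. A minimal sketch, with a hypothetical `toy_model` callable standing in for the real network:

```python
def generate(model, prompt_ids, max_new_tokens=20, eos_id=0):
    """Greedy autoregressive decoding: one forward pass per new token."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = model(ids)                    # depends on all tokens so far
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:
            break
        ids.append(next_id)
        yield next_id                          # stream it as soon as it exists

def toy_model(ids):
    # Hypothetical stand-in: always prefers the token after the last one.
    logits = [0.0] * 10
    logits[(ids[-1] + 1) % 10] = 1.0
    return logits

print(list(generate(toy_model, [3], max_new_tokens=5)))  # [4, 5, 6, 7, 8]
```

So the word-by-word display is generally not just for show; streaming each token as it is produced is the natural output mode of this loop.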
8 votes · 1 answer
What causes ChatGPT to generate responses that refer to itself as a bot or LM?
ChatGPT occasionally generates responses to prompts that refer to itself as a "bot" or "language model."
For instance, when given a certain input (the first paragraph of this question) ChatGPT produces (in part) the output:
It is not appropriate…

Obie 2.0
7 votes · 2 answers
Why can't language models, like GPT-3, continuously learn once trained?
GPT-3 has a prompt limit of roughly 2048 "tokens", where a token corresponds to about 4 characters of text. If my understanding is correct, a deep neural network is not learning after it is trained and is used to produce an output, and, as such, this…

MaiaVictor
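The premise of the question, that a deployed network's weights are frozen, matches how inference is typically served: weights change only when an optimizer applies gradient updates, and serving code never runs one. A minimal PyTorch sketch of the distinction, with a stand-in linear layer:

```python
import torch

model = torch.nn.Linear(4, 2)      # stand-in for a trained network
model.eval()                       # inference mode

with torch.no_grad():              # no gradients tracked, no weights updated
    out = model(torch.randn(1, 4))

# Learning would require an explicit training step, e.g.:
#   loss.backward(); optimizer.step()
# which inference-serving code for a model like GPT-3 never executes.
```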
7 votes · 1 answer
How to use BERT as a multi-purpose conversational AI?
I'm looking to make an NLP model that can achieve a dual purpose. One purpose is to hold interesting conversations (conversational AI); the other is to do intent classification and even accomplish the classified task.
To…

junfanbl
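For the intent-classification half, a common approach is to put a classification head on BERT, e.g. via the Hugging Face `transformers` library. A minimal sketch: the checkpoint and the three intent labels are placeholders, and the head below is untrained (a real system would fine-tune it on an intent dataset). Note that BERT is an encoder, so the open-ended conversational half usually calls for a separate generative model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3,   # e.g. 0=chat, 1=set_alarm, 2=play_music (placeholders)
)

inputs = tokenizer("Wake me up at 7 tomorrow", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # shape: (1, 3)
print(logits.argmax(dim=-1).item())            # predicted intent id
```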
6 votes · 1 answer
How was ChatGPT trained?
I know that large language models like GPT-3 are trained simply to continue pieces of text that have been scraped from the web. But how was ChatGPT trained, which, while also having a good understanding of language, is not directly a language model,…

HelloGoodbye
6 votes · 1 answer
What are pros and cons of Bi-LSTM as compared to LSTM?
What are the pros and cons of LSTM vs Bi-LSTM in language modelling? What was the need to introduce Bi-LSTM?

DRV
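The core trade-off is visible directly in PyTorch's built-in LSTM: a Bi-LSTM runs a second pass right-to-left and concatenates both hidden states, so every position sees past and future context. That helps tagging and classification, but it cannot be used for standard next-word language modelling, where the future words are exactly what is being predicted. A minimal sketch with toy dimensions:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 5, 8)   # (batch, seq_len, input_features)

uni = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
bi = nn.LSTM(input_size=8, hidden_size=16, batch_first=True,
             bidirectional=True)

out_uni, _ = uni(x)
out_bi, _ = bi(x)
print(out_uni.shape)  # torch.Size([1, 5, 16])
print(out_bi.shape)   # torch.Size([1, 5, 32]) -- both directions concatenated
```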
5 votes · 2 answers
Where can I find pre-trained language models in English and German?
Where can I find (more) pre-trained language models? I am especially interested in neural network-based models for English and German.
I am aware only of Language Model on One Billion Word Benchmark and TF-LM: TensorFlow-based Language Modeling…

Lutz Büch
5 votes · 1 answer
How does GPT-based language model like ChatGPT determine the n-th letter of a word?
I understand that GPT models process input text by converting words into tokens and then into embedding vectors, and do not process them letter by letter. Given this approach, I am curious to know how a model like ChatGPT can identify the first (or n-th)…

Peyman
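The crux can be seen by running a GPT-style tokenizer directly. A minimal sketch, assuming the Hugging Face `transformers` GPT-2 tokenizer: the model receives ids for multi-character chunks, never individual letters, so letter-position questions can only be answered from spellings it has effectively memorized during training.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
word = "strawberry"
print(tok.tokenize(word))        # multi-character BPE pieces, not letters
print(tok(word)["input_ids"])    # the ids the model actually sees

# The n-th letter is never an explicit input feature of these ids.
```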
5 votes · 2 answers
How is the next token predicted in transformers?
In the transformer (or GPT/decoder only), at the end of the decoder blocks but before the final linear layer you have X vectors (for the X tokens at the input of the decoder). We then want to compute the probabilities for the next token of the…

Miguel Carvalho
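The step after the decoder blocks is standard across GPT-style models: only the hidden vector at the last position is projected by the final linear layer (the "LM head") to vocabulary logits, and a softmax turns those into the next-token distribution. (During training, every position's vector is projected the same way to predict its own next token in parallel.) A minimal sketch with toy dimensions:

```python
import torch

seq_len, d_model, vocab = 5, 16, 100
hidden = torch.randn(seq_len, d_model)     # output of the decoder stack
lm_head = torch.nn.Linear(d_model, vocab)  # the final linear layer

logits = lm_head(hidden[-1])               # last position's vector only
probs = torch.softmax(logits, dim=-1)      # distribution over the vocabulary
print(probs.shape, float(probs.sum()))     # torch.Size([100]) ~1.0
```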
5 votes · 1 answer
How can a language model keep track of the provenance of the main knowledge/sources used to generate a given output?
One of the main criticisms against the use of ChatGPT on Stack Exchange is that it doesn't attribute the main knowledge/sources used to generate a given output. How can a language model keep track of the provenance of the main knowledge/sources used…

Franck Dernoncourt
5 votes · 2 answers
What is the difference between a language model and a word embedding?
I am self-studying applications of deep learning to NLP and machine translation.
I am confused about the concepts of "Language Model", "Word Embedding", "BLEU Score".
It appears to me that a language model is a way to predict the next word given…

Exploring
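The distinction can be made concrete in a few lines: a word embedding is a lookup table from token ids to vectors, while a language model maps a context to a probability distribution over the next token, and typically contains an embedding as its first layer. A minimal sketch with toy sizes (the mean-pooled "context" is a deliberate oversimplification):

```python
import torch
import torch.nn as nn

vocab, dim = 50, 8

# Word embedding: a lookup table, id -> vector. No probabilities involved.
embedding = nn.Embedding(vocab, dim)
print(embedding(torch.tensor([3])).shape)        # torch.Size([1, 8])

# Toy language model: context vector -> P(next token | context).
lm_head = nn.Linear(dim, vocab)
context = embedding(torch.tensor([3, 7, 9])).mean(dim=0)
probs = torch.softmax(lm_head(context), dim=-1)
print(probs.shape, float(probs.sum()))           # torch.Size([50]) ~1.0
```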
4 votes · 2 answers
What makes reproducing a model like GPT3/GPT3.5/ChatGPT difficult?
Is it difficult for other companies to train a model similar to ChatGPT, and what makes it difficult? What is challenging about reproducing the results obtained by OpenAI with ChatGPT/GPT3.5? Would it be possible for a company like Meta or Google to…

Robin van Hoorn