I recently heard of GPT-3 and I don't understand how attention models and transformer encoders and decoders work. I heard that GPT-3 can make a website from a description and write perfectly factual essays. How can it understand our world using algorithms and then recreate human-like content? How can it learn to understand a description and program in HTML?
-
It can't "understand" anything. It just matches patterns to something it encountered in the training data. – Oliver Mason May 07 '21 at 16:00
2 Answers
GPT-3 (and the like) doesn't really have any understanding of the semantics or pragmatics involved in language. However, such models are good at constructing text similar to content created by a person (when the texts and concepts are not too "complicated").

-
But how do the different layers and math formulas work to create this type of intelligence? – DragonflyRobotics May 07 '21 at 19:13
-
There are many questions regarding this issue on this platform, though I wouldn't call it "intelligence". Attention is all you need.... – Raul Alvarez May 07 '21 at 20:18
-
So then what is "Attention". I simply want to understand the working principle of GPT-3 – DragonflyRobotics May 07 '21 at 21:10
-
@DragonflyRobotics We don't have a great idea. We know the general principle of neural networks: build something that can in principle do the computation, then spend hundreds of thousands of dollars (in the worst case) renting GPUs to tweak it quadrillions of times until it performs well. – user253751 Aug 10 '21 at 16:31
-
@DragonflyRobotics The attention unit is, AFAIK, the major contribution of the transformers paper. If you want to know what attention is then I suggest you focus your question around that. – user253751 Aug 10 '21 at 16:32
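To give the comments above something concrete: the attention unit from the "Attention Is All You Need" paper can be sketched in a few lines of NumPy. This is a minimal, illustrative version of scaled dot-product attention only (real transformers add multiple heads, learned projection matrices, masking, and many stacked layers); the function name and toy data are my own, not from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each output row is a weighted average of the rows of V,
    where the weights measure how well each query matches each key."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # query-key similarity, scaled
    # softmax over the key dimension, computed stably
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy example: 3 tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
# Self-attention: queries, keys, and values all come from the same tokens.
out, w = scaled_dot_product_attention(x, x, x)
print(w.sum(axis=-1))  # each row of attention weights sums to 1
```

The key intuition: attention lets every token look at every other token and decide, via the softmax weights, how much each one matters when building its new representation.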
Two years after the original question was asked, I feel it is time to provide an updated answer:
How can it understand our world using algorithms and then recreate human-like content?
GPT-3 and GPT-4 (transformers, ChatGPT and the like) do not understand our world. They provide answers based on the information available in their training datasets and in any additional dataset used to fine-tune the model. Those answers recreate human-like content even though the models/algorithms do not understand, nor have a representation of, the world.
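The idea in this answer — human-like output without any world model — can be illustrated with a deliberately tiny stand-in: a bigram table that predicts the next word purely from co-occurrence statistics in its "training data". GPT-3 does the same kind of next-token prediction with a transformer over a vastly larger corpus; the toy corpus and variable names here are my own invention for illustration.

```python
import random
from collections import defaultdict

# A miniature "training dataset": the model will learn nothing but
# which word tends to follow which word in this text.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# "Training": record every observed next-word for each word.
following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

# "Generation": repeatedly sample a plausible next token.
random.seed(0)
token = "the"
generated = [token]
for _ in range(6):
    token = random.choice(following[token])
    generated.append(token)
print(" ".join(generated))
```

Every word pair it emits was seen in the corpus, so the output looks grammatical, yet the model has no concept of cats, mats, or sitting. Scaled up enormously, that is the sense in which GPT-style models recreate human-like content without understanding it.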
