
In the transformer model, the authors added a positional encoding to incorporate the positional information of the text. How does positional encoding work? And how does it handle positions when texts of varying lengths and types are passed in at different times?

To be more concrete, let's take these two sentences.

  1. "She is my queen"
  2. "Elizabeth is the queen of England"

How would these sentences be passed to the transformer? What would happen to them during the positional encoding part?

Please explain with less math and more of the intuition behind it.
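For reference, here is a minimal sketch of the sinusoidal scheme from the original "Attention Is All You Need" paper applied to the two sentences above. The whitespace tokenization and the tiny embedding size are simplifications of my own for illustration; real models use subword tokenizers and much larger embedding dimensions.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Standard sinusoidal encoding from "Attention Is All You Need":
    PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    """
    positions = np.arange(seq_len)[:, np.newaxis]        # shape (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]       # shape (1, d_model / 2)
    angle_rates = 1.0 / np.power(10000, dims / d_model)
    angles = positions * angle_rates                      # shape (seq_len, d_model / 2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions
    pe[:, 1::2] = np.cos(angles)   # odd dimensions
    return pe

# Naive whitespace "tokenization", just for illustration.
sentence_1 = "She is my queen".split()                    # 4 tokens -> positions 0..3
sentence_2 = "Elizabeth is the queen of England".split()  # 6 tokens -> positions 0..5

d_model = 8  # tiny embedding size so the numbers are easy to inspect
pe_1 = sinusoidal_positional_encoding(len(sentence_1), d_model)
pe_2 = sinusoidal_positional_encoding(len(sentence_2), d_model)

# The encoding depends only on the position, not on the word or the sentence:
# "queen" sits at position 3 in both sentences, so it receives the same
# positional vector in both, which is then added to its word embedding.
print(np.allclose(pe_1[3], pe_2[3]))  # True
```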

  • There are already two other similar questions on this site (both of them with no answer): [this](https://ai.stackexchange.com/q/10989/2444) and [this](https://ai.stackexchange.com/q/15386/2444). – nbro Sep 17 '20 at 13:25
