
In the transformer model, the authors added a positional encoding to incorporate the positional information of the text. How does positional encoding work? And how does it handle positions when texts of varying lengths and types are passed in at different times?

To be more concrete, let's take these two sentences.

  1. "She is my queen"
  2. "Elizabeth is the queen of England"

How would these sentences be passed to the transformer? What would happen to them during the positional encoding part?

Please explain with less math and more of the intuition behind it.
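For reference, here is a minimal sketch of the sinusoidal scheme from the original "Attention Is All You Need" paper applied to the two sentences above. The whitespace tokenization and the tiny embedding size are simplifications of my own for illustration; real models use subword tokenizers and much larger embedding dimensions.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Standard sinusoidal encoding from "Attention Is All You Need":
    PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    """
    positions = np.arange(seq_len)[:, np.newaxis]        # shape (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]       # shape (1, d_model / 2)
    angle_rates = 1.0 / np.power(10000, dims / d_model)
    angles = positions * angle_rates                      # shape (seq_len, d_model / 2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions
    pe[:, 1::2] = np.cos(angles)   # odd dimensions
    return pe

# Naive whitespace "tokenization", just for illustration.
sentence_1 = "She is my queen".split()                    # 4 tokens -> positions 0..3
sentence_2 = "Elizabeth is the queen of England".split()  # 6 tokens -> positions 0..5

d_model = 8  # tiny embedding size so the numbers are easy to inspect
pe_1 = sinusoidal_positional_encoding(len(sentence_1), d_model)
pe_2 = sinusoidal_positional_encoding(len(sentence_2), d_model)

# The encoding depends only on the position, not on the word or the sentence:
# "queen" sits at position 3 in both sentences, so it receives the same
# positional vector in both, which is then added to its word embedding.
print(np.allclose(pe_1[3], pe_2[3]))  # True
```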

  • There are already two other similar questions on this site (both of them with no answer): [this](https://ai.stackexchange.com/q/10989/2444) and [this](https://ai.stackexchange.com/q/15386/2444). – nbro Sep 17 '20 at 13:25
