The Transformer model proposed in "Attention Is All You Need" uses sinusoidal functions for its positional encoding.
Why are both sine and cosine used? And why do the even and odd dimensions need to be separated and given different sinusoids?
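For reference, the paper defines the encoding as

$$
PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad
PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)
$$

so each pair of dimensions $(2i, 2i+1)$ shares one frequency, with sine on the even index and cosine on the odd index.

Here is a minimal sketch of how I understand the construction, assuming $d_{model}$ is even (the function name and shapes are just my own illustration, not from the paper):

```python
import numpy as np

def positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding as I understand it (assumes d_model is even)."""
    pos = np.arange(max_len)[:, None]            # (max_len, 1) token positions
    i = np.arange(d_model // 2)[None, :]         # (1, d_model/2) frequency indices
    angle = pos / np.power(10000.0, (2 * i) / d_model)

    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle)                  # even dimensions get sine
    pe[:, 1::2] = np.cos(angle)                  # odd dimensions get cosine
    return pe
```

Is the point of the sine/cosine pairing simply to give each frequency two phase-shifted components, or is there a deeper reason for the even/odd split?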