Transformers work with sequences of vectors: a sentence of length SEQ_LEN, where each word is an embedding vector of size EMBEDDING_DIM. Since the model still uses Dense layers internally (as in https://www.tensorflow.org/text/tutorials/transformer), I'm having trouble understanding how this 2D per-sample input is passed through a Dense layer. Isn't 2D data usually flattened before entering a Dense layer, as in the case of an image?
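For concreteness, here's a minimal sketch of the shapes I have in mind (all sizes are made up). When I pass such a tensor to a Dense layer directly, it doesn't error, which is exactly the part I don't understand:

```python
import tensorflow as tf

# Hypothetical sizes, just for illustration
BATCH_SIZE, SEQ_LEN, EMBEDDING_DIM = 32, 10, 64

# A batch of sentences: each sample is 2D (SEQ_LEN, EMBEDDING_DIM)
x = tf.random.normal((BATCH_SIZE, SEQ_LEN, EMBEDDING_DIM))

# No Flatten layer in between, yet this runs fine
dense = tf.keras.layers.Dense(128)
y = dense(x)
print(y.shape)  # (32, 10, 128)
```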
Actually, in the general case: let's say I have a batch of sentences, with an embedding vector for each word in each sentence. How would I pass this into any layer, whether Dense, RNN, LSTM, etc.?
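For example, with a recurrent layer I would assume the call looks something like the sketch below (again, all sizes are made up), but I'm not sure whether the layer interprets the axes the same way a Dense layer would:

```python
import tensorflow as tf

BATCH_SIZE, SEQ_LEN, EMBEDDING_DIM = 32, 10, 64
x = tf.random.normal((BATCH_SIZE, SEQ_LEN, EMBEDDING_DIM))

# An LSTM consuming the same (batch, seq_len, embedding_dim) tensor,
# returning one output vector per word
lstm = tf.keras.layers.LSTM(128, return_sequences=True)
y = lstm(x)
print(y.shape)  # (32, 10, 128)
```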