Questions tagged [sequence-modeling]

Questions about the analysis and modelling of sequential data, for example audio signals or time series to be predicted.

66 questions
23
votes
3 answers

Can the decoder in a transformer model be parallelized like the encoder?

Can the decoder in a transformer model be parallelized like the encoder? As far as I understand, the encoder has all the tokens in the sequence to compute the self-attention scores. But for a decoder, this is not possible (in both training and…
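A point this question hinges on is that during training the decoder can process all target positions in parallel by applying a causal mask (teacher forcing); only autoregressive inference is inherently sequential. Below is a minimal NumPy sketch of causally masked self-attention; the shapes and names are chosen purely for illustration, not taken from the question.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a whole sequence at once,
    with a causal mask so position i only attends to positions <= i."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # each (seq_len, d)
    scores = q @ k.T / np.sqrt(k.shape[-1])       # (seq_len, seq_len)
    mask = np.triu(np.ones_like(scores), k=1)     # 1 strictly above the diagonal
    scores = np.where(mask == 1, -1e9, scores)    # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # (seq_len, d)

# Toy example: during training, all target positions are computed in one pass.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                       # 5 tokens, model dimension 8
w = [rng.normal(size=(8, 8)) for _ in range(3)]
print(causal_self_attention(x, *w).shape)         # (5, 8)
```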
8
votes
4 answers

How can I predict the next number in a non-obvious sequence?

I've got an array of integers ranging from -3 to +3. Example: [1, 3, -2, 0, 0, 1] The array has no obvious pattern since it represents bipolar disorder mood swings. What is the most suitable approach to predict the next number in the series? The…
6
votes
2 answers

What evaluation metrics are used for sequence-to-sequence prediction problems?

I am solving many sequence-to-sequence prediction problems using RNNs/LSTMs. What types of evaluation metrics can be used for sequence prediction problems? One metric is the mean squared error (MSE), which we can pass as a parameter during the training…
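For the MSE case mentioned in the excerpt, one common convention (assumed here, not prescribed by the question) is to average the squared error over sequences, time steps and features alike. A short NumPy illustration:

```python
import numpy as np

def sequence_mse(y_true, y_pred):
    """Mean squared error averaged over sequences and time steps."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

# Two predicted sequences of length 4 versus their targets.
targets = [[0.0, 1.0, 2.0, 3.0], [1.0, 0.0, -1.0, -2.0]]
preds   = [[0.1, 0.9, 2.2, 2.8], [1.1, 0.2, -0.9, -2.1]]
print(sequence_mse(targets, preds))  # ~0.02
```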
6
votes
3 answers

In sequence-to-sequence, why is the output of the decoder used as its input?

The basic seq2seq model consists of two parts: a recurrent encoder that compresses a sequence into a vector and a decoder that unrolls the vector into the output sequence. Why is the output, w, x, y, z, of the decoder used as its input? Shouldn't the…
user8426627
  • 358
  • 1
  • 11
5
votes
2 answers

Why do Transformers have a sequence limit at inference time?

As far as I understand, a Transformer's time complexity grows quadratically with the sequence length. As a result, a maximum sequence length is set during training to keep it feasible, and, to allow batching, all sequences smaller…
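The quadratic growth the excerpt refers to comes from the n × n attention-score matrix that each head builds. A back-of-the-envelope sketch (sequence lengths and float size assumed for illustration):

```python
# Memory for one attention-score matrix in float32, per head and per layer.
def attention_matrix_bytes(seq_len: int, bytes_per_float: int = 4) -> int:
    return seq_len * seq_len * bytes_per_float

for n in (512, 2048, 8192):
    print(n, f"{attention_matrix_bytes(n) / 2**20:.1f} MiB")
# 512 -> 1.0 MiB, 2048 -> 16.0 MiB, 8192 -> 256.0 MiB:
# 4x the sequence length costs 16x the memory.
```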
5
votes
1 answer

Why do small datasets require more samples, while big datasets require fewer samples in negative sampling?

In the Deep Learning Specialization course by Andrew Ng, in the video Sequence Models (minute 4:13), he says that in negative sampling we have to choose a sample of words from the corpus to train on, rather than using the whole corpus. But he said…
4
votes
1 answer

Why do we need both encoder and decoder in sequence to sequence prediction?

Why do we need both an encoder and a decoder in sequence-to-sequence prediction? We could just have a single RNN that, given input $x$, outputs some value $y(t)$ and a hidden state $h(t)$. Next, given $h(t)$ and $y(t)$, the next output $y(t+1)$ and hidden…
4
votes
1 answer

Can Reinforcement Learning be used to generate sequences?

Can we use reinforcement learning for sequence-to-sequence tasks? If so, how could this be done, and would it be a good choice?
4
votes
1 answer

How can I use machine learning to predict properties (such as the area) of simple polygons?

Imagine a set of simple (non-self-intersecting) polygons given by the coordinate pairs of their vertices $[(x_1, y_1), (x_2, y_2), \dots,(x_n, y_n)]$. The polygons in the set have a different number of vertices. How can I use machine learning to…
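One way to make this question concrete is to note that the target labels can be generated cheaply: the area of a simple polygon follows directly from its vertex list via the shoelace formula, so supervised training data is easy to produce. A small sketch of that step only (the sequence-model side is out of scope here):

```python
def shoelace_area(vertices):
    """Area of a simple polygon given as [(x1, y1), ..., (xn, yn)]."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to the first vertex
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# Unit square, listed counter-clockwise.
print(shoelace_area([(0, 0), (1, 0), (1, 1), (0, 1)]))  # 1.0
```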
4
votes
0 answers

Can sequence-to-sequence models be used to convert source code from one programming language to another?

Sequence-to-sequence models have achieved good performance in natural language translation. Could these models also be applied to convert source code written in one programming language to source code written in another language? Could they also be…
3
votes
2 answers

Is seq2seq the best model when input/output sequences have fixed length?

I understand that seq2seq models are perfectly suitable when the input and/or the output have variable lengths. However, if we know the input/output sequence lengths of the neural network exactly, is this still the best approach?
Petrus
  • 31
  • 1
3
votes
2 answers

How to use an LSTM to generate a paragraph

An LSTM model can be trained to generate text sequences by feeding it the first word. After the first word is fed in, the model generates a sequence of words (a sentence): feed the first word to get the second word, feed the first word + the second…
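The loop the excerpt describes is the standard autoregressive sampling pattern. A minimal sketch of that loop, with a stand-in `next_word_probs` function that is purely hypothetical here (in a real setup it would be the trained LSTM's softmax output):

```python
import numpy as np

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]
rng = np.random.default_rng(0)

def next_word_probs(context):
    """Hypothetical stand-in for a trained LSTM: returns a distribution
    over the vocabulary given the words generated so far."""
    logits = rng.normal(size=len(VOCAB))
    e = np.exp(logits - logits.max())
    return e / e.sum()

def generate(first_word, max_len=20):
    words = [first_word]
    while len(words) < max_len:
        probs = next_word_probs(words)
        word = VOCAB[int(rng.choice(len(VOCAB), p=probs))]
        if word == "<eos>":              # stop token ends the sentence
            break
        words.append(word)               # feed everything generated so far back in
    return " ".join(words)

print(generate("the"))
```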
2
votes
1 answer

Can the recurrent neural network's input come from a short-time Fourier transform?

Can a recurrent neural network's input come from a short-time Fourier transform? I mean that the input would not come from the time domain.
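For context, this is a common pipeline: compute the STFT and treat its frames (one spectrum per time step) as the recurrent network's input sequence. A hedged sketch using `scipy.signal.stft`, with the signal and parameters invented for illustration:

```python
import numpy as np
from scipy.signal import stft

fs = 16000                                   # sample rate in Hz
t = np.arange(fs) / fs                       # one second of "audio"
signal = np.sin(2 * np.pi * 440 * t)         # toy 440 Hz tone

f, frame_times, Zxx = stft(signal, fs=fs, nperseg=512)
features = np.abs(Zxx).T                     # shape (n_frames, n_freq_bins)
print(features.shape)                        # each row is one RNN time step
```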
2
votes
1 answer

Difference between dot product attention and "matrix attention"

As far as I know, attention was first introduced in Learning To Align And Translate. There, the core mechanism that is able to disregard the sequence length is a dynamically built matrix of shape output_size X input_size, in which every position…
Gulzar
  • 729
  • 1
  • 8
  • 23
2
votes
1 answer

Sequence embedding using an embedding layer: how does the network architecture influence it?

I want to obtain a dense vector representation of protein sequences so that I can meaningfully represent them in an embedding space. We can consider them as sequences of letters; in particular, there are 21 unique symbols, which are the amino acids…
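As a starting point before considering the surrounding architecture, an embedding layer is just a learned lookup table indexed by the 21 symbols. A framework-free sketch; the exact alphabet (20 standard amino acids plus an extra code), the embedding dimension and the example sequence are all assumptions made for illustration:

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWYX"   # 21 symbols (assumed alphabet, X as the extra code)
token_to_id = {a: i for i, a in enumerate(AMINO_ACIDS)}

embedding_dim = 16
rng = np.random.default_rng(0)
# In a real model this matrix is a trainable parameter, not random.
embedding_matrix = rng.normal(size=(len(AMINO_ACIDS), embedding_dim))

sequence = "MKT"                                 # toy protein fragment
ids = np.array([token_to_id[a] for a in sequence])
vectors = embedding_matrix[ids]                  # (len(sequence), embedding_dim)
print(vectors.shape)
```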