Questions tagged [batch-learning]

For questions about machine learning algorithms that learn from batches of data rather than one example at a time (the latter is known as online learning). Batch learning is also called offline learning, and it is the most common way of training machine learning models.

13 questions
4 votes, 1 answer

Why would a VAE train much better with batch sizes closer to 1 over batch size of 100+?

I've been training a VAE to reconstruct human names. When I train it with a batch size of 100+, after about 5 hours of training it tends to output the same thing regardless of the input (I'm using teacher forcing as well). When I use a lower…
asked by user8714896 (717 rep)
3 votes, 1 answer

Is batch learning with gradient descent equivalent to "rehearsal" in incremental learning?

I am learning about incremental learning and have read that rehearsal learning is retraining with old data. In essence, isn't this exactly the same thing as batch learning (with stochastic gradient descent)? You train a model by passing in batches of data…
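
A minimal sketch of the equivalence being asked about: rehearsal as plain batch SGD over old and new data mixed together, in PyTorch. The model, datasets, and loss function are assumed to exist; all names are illustrative.

    import torch
    from torch.utils.data import ConcatDataset, DataLoader

    # Hypothetical model, datasets, and loss; `old_dataset` is the rehearsal data.
    def rehearse(model, old_dataset, new_dataset, loss_fn, epochs=1):
        # Rehearsal here is just batch SGD over old and new data mixed together.
        loader = DataLoader(ConcatDataset([old_dataset, new_dataset]),
                            batch_size=32, shuffle=True)
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
        for _ in range(epochs):
            for x, y in loader:
                optimizer.zero_grad()
                loss = loss_fn(model(x), y)  # same update rule as ordinary batch learning
                loss.backward()
                optimizer.step()
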
2 votes, 0 answers

What's the most efficient way of performing batched training of Causal Language Models?

I have seen a number of ways to train (yes, train, not fine-tune) these models efficiently with batches. I will illustrate these techniques with the following example dataset and context window. Context window: … Data samples: 1.…
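
One common batching scheme is to pad each batch to its longest sample and mask the padding out of both attention and the loss. A minimal sketch with Hugging Face transformers; the GPT-2 checkpoint and example texts are just placeholders.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    tok.pad_token = tok.eos_token                  # GPT-2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    texts = ["first sample", "a somewhat longer second sample"]
    batch = tok(texts, padding=True, return_tensors="pt")
    labels = batch["input_ids"].clone()
    labels[batch["attention_mask"] == 0] = -100    # exclude padding from the loss
    loss = model(**batch, labels=labels).loss      # shifted next-token loss
    loss.backward()
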
2 votes, 2 answers

Are batches useful for REINFORCE without strong episode cutoffs?

I'm following along with PyTorch's example implementations (found here) of reinforcement learning algorithms, which happen to be largely REINFORCE (vanilla policy gradient) based, and I notice they don't use batches. This leads me to ask: are batch…
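
For reference, a minimal sketch of one batched REINFORCE variant: collect several complete episodes with the current policy, then take a single gradient step on the mean loss. The episode data structure and names are illustrative.

    import torch

    # `episodes` is a list of episodes, each a list of (log_prob, return) pairs
    # collected with the current policy; all names are illustrative.
    def reinforce_batch_update(optimizer, episodes):
        losses = [-log_prob * ret
                  for episode in episodes
                  for log_prob, ret in episode]
        loss = torch.stack(losses).mean()   # one gradient step per batch of episodes
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
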
2 votes, 1 answer

How to sample the tuples during the initial time steps of the DDPG algorithm?

I am facing an issue in understanding the following line from the pseudocode of the DDPG algorithm: "Sample a random minibatch of $N$ transitions $(s_i, a_i, r_i, s_{i+1})$ from $R$." Here $N$ is a hyperparameter that is equal to the number of…
asked by hanugm (3,571 rep)
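
A common reading of that line is that learning updates only begin once the replay buffer holds at least $N$ transitions; a minimal warm-up sketch under that assumption (names are illustrative).

    import random

    R = []      # replay buffer of (s, a, r, s_next) transitions
    N = 64      # minibatch size hyperparameter from the pseudocode

    def maybe_sample_minibatch():
        if len(R) < N:              # warm-up: too few transitions collected so far,
            return None             # so keep acting (e.g., with exploration noise)
        return random.sample(R, N)  # uniform sampling without replacement
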
2 votes, 0 answers

Methodologies for passing the best samples for a neural network to learn

Just an idea I am sure I read in a book some time ago, but I can't remember the name. Given a very large dataset and a neural network (or anything that can learn via something like stochastic gradient descent, passing a subset of samples to modify…
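
One such methodology is loss-based selection (sometimes called hard-example mining): score a candidate pool with the current model and train on the highest-loss samples. A minimal PyTorch sketch, assuming `loss_fn` returns per-sample losses (reduction='none'); names are illustrative.

    import torch

    # Score a candidate pool with the current model and keep the k highest-loss
    # samples; `loss_fn` must return per-sample losses (reduction='none').
    def select_hardest(model, loss_fn, xs, ys, k):
        with torch.no_grad():
            losses = loss_fn(model(xs), ys)
        idx = losses.topk(k).indices        # indices of the k hardest samples
        return xs[idx], ys[idx]
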
2 votes, 1 answer

Offline/Batch Reinforcement Learning: when to stop training and what agent to select

Context: My team and I are working on an RL problem for a specific application. We have data collected from user interactions (states, actions, rewards, etc.). It is too costly for us to emulate agents. We therefore decided to concentrate on Offline…
1 vote, 1 answer

Batching together similar length sequences to avoid padding and packing

I am training an RNN in PyTorch to produce captions for images. It's a pretty standard architecture: the image is processed by a pre-trained InceptionV3 to extract features, the recurrent module processes the words seen so far, and then its result…
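
A minimal sketch of the bucketing idea in plain Python: group sequences by length so every batch can be stacked without padding or packing. `sequences` is assumed to be a list of token lists; names are illustrative.

    from collections import defaultdict

    # Group sequences by length so each batch stacks without padding or packing.
    def bucket_by_length(sequences, batch_size):
        buckets = defaultdict(list)
        for seq in sequences:
            buckets[len(seq)].append(seq)
        batches = []
        for same_length in buckets.values():
            for i in range(0, len(same_length), batch_size):
                batches.append(same_length[i:i + batch_size])
        return batches
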
1 vote, 1 answer

Why does the output shape of a Dense layer contain a batch size?

I understand that the batch size is the number of examples you pass into the neural network (NN). If the batch size is 10, you feed the NN 10 examples at once. Assume I have an NN with a single Dense layer. This Dense layer of 20 units…
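
A minimal Keras sketch of where the batch dimension shows up: the leading None in the printed shape is the batch size, left unspecified at build time because the layer maps every example in the batch independently. The input size of 5 is just a placeholder.

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(5,)),     # 5 features per example; batch size unspecified
        tf.keras.layers.Dense(20),      # 20 units
    ])
    model.summary()                     # Dense output shape prints as (None, 20)
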
1 vote, 1 answer

What is the difference between batches in deep Q learning and supervised learning?

How is the batch loss calculated in both DQNs and simple classifiers? From what I understand, in a classifier a common method is to sample a mini-batch, calculate the loss for every example, calculate the average loss over the whole batch,…
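
A minimal PyTorch sketch of the DQN side, which follows the same recipe (per-example losses, then the batch mean); standard online/target networks are assumed and names are illustrative.

    import torch
    import torch.nn.functional as F

    # One DQN minibatch update target: r + gamma * max_a' Q_target(s', a').
    def dqn_batch_loss(q_net, target_net, s, a, r, s_next, done, gamma=0.99):
        q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)   # Q(s_i, a_i) per example
        with torch.no_grad():
            target = r + gamma * target_net(s_next).max(1).values * (1 - done)
        return F.mse_loss(q, target)                        # mean over the minibatch
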
0 votes, 0 answers

How is it possible to use batches of data from within the same sequence with an LSTM?

ETA: More concise wording: Why do some implementations use batches of data taken from within the same sequence? Does this not make the cell state useless? Take the example of an LSTM: it has a hidden state and a cell state. These states are updated…
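
A minimal truncated-BPTT sketch in PyTorch of what such implementations typically do: consecutive chunks come from the same sequence, and the detached hidden/cell state is carried across chunk boundaries, so the cell state is not discarded. Sizes are placeholders.

    import torch

    lstm = torch.nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
    long_sequence = torch.randn(1, 1000, 8)        # one long sequence
    state = None                                   # (hidden, cell), zeros at first
    for chunk in long_sequence.split(100, dim=1):  # consecutive 100-step chunks
        out, state = lstm(chunk, state)
        state = tuple(s.detach() for s in state)   # keep the values, cut the graph
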
0 votes, 1 answer

Having the negative cases in the same batch vs. shuffling the dataset

I am working on a model for an NLP task. The model encodes the text and has a regression output layer. In this task, from each (positive) instance I create several negative cases using a specific technique, and I merge them with their positive…
asked by Minions (123 rep)
0 votes, 1 answer

Is it okay to calculate the validation loss over batches instead of the whole validation set for speed purposes?

I have about 2000 items in my validation set. Would it be reasonable to calculate the loss/error after each epoch on just a subset instead of the whole set, if evaluating the whole set is very slow? Would taking random mini-batches to calculate…
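
A minimal sketch of that approach: validate on a random subset each epoch, giving a cheap but noisy estimate of the full-set loss. `val_pairs` and the subset size are illustrative.

    import random
    import torch

    # `val_pairs` is a list of (input, target) pairs; names are illustrative.
    def quick_val_loss(model, loss_fn, val_pairs, subset_size=256):
        model.eval()
        subset = random.sample(val_pairs, subset_size)
        with torch.no_grad():
            losses = [loss_fn(model(x), y) for x, y in subset]
        return torch.stack(losses).mean().item()
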