Questions tagged [batch-learning]

For questions about machine learning algorithms that learn from batches of data rather than one example at a time (the latter is known as online learning). Batch learning is also called offline learning, and it is the most common way of training machine learning models.

13 questions
4 votes, 1 answer

Why would a VAE train much better with batch sizes closer to 1 over batch size of 100+?

I've been training a VAE to reconstruct human names. When I train it with a batch size of 100+, after about 5 hours of training it tends to output the same thing regardless of the input (I'm using teacher forcing as well). When I use a lower…
asked by user8714896 (717 rep)
3 votes, 1 answer

Is batch learning with gradient descent equivalent to "rehearsal" in incremental learning?

I am learning about incremental learning and have read that rehearsal learning is retraining with old data. In essence, isn't this exactly the same thing as batch learning (with stochastic gradient descent)? You train a model by passing in batches of data…
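
A minimal sketch of the equivalence being asked about: rehearsal as plain batch SGD over old and new data mixed together, in PyTorch. The model, datasets, and loss function are assumed to exist; all names are illustrative.

    import torch
    from torch.utils.data import ConcatDataset, DataLoader

    # Hypothetical model, datasets, and loss; `old_dataset` is the rehearsal data.
    def rehearse(model, old_dataset, new_dataset, loss_fn, epochs=1):
        # Rehearsal here is just batch SGD over old and new data mixed together.
        loader = DataLoader(ConcatDataset([old_dataset, new_dataset]),
                            batch_size=32, shuffle=True)
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
        for _ in range(epochs):
            for x, y in loader:
                optimizer.zero_grad()
                loss = loss_fn(model(x), y)  # same update rule as ordinary batch learning
                loss.backward()
                optimizer.step()
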
2 votes, 0 answers

What's the most efficient way of performing batched training of Causal Language Models?

I have seen a number of ways to train (yes, train, not fine-tune) these models efficiently with batches. I will illustrate these techniques with the following example dataset and context window. Context window: … Data samples: 1.…
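
One common batching scheme is to pad each batch to its longest sample and mask the padding out of both attention and the loss. A minimal sketch with Hugging Face transformers; the GPT-2 checkpoint and example texts are just placeholders.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    tok.pad_token = tok.eos_token                  # GPT-2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    texts = ["first sample", "a somewhat longer second sample"]
    batch = tok(texts, padding=True, return_tensors="pt")
    labels = batch["input_ids"].clone()
    labels[batch["attention_mask"] == 0] = -100    # exclude padding from the loss
    loss = model(**batch, labels=labels).loss      # shifted next-token loss
    loss.backward()
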
2 votes, 2 answers

Are batches useful for REINFORCE without strong episode cutoffs?

I'm following along with PyTorch's example implementations (found here) of reinforcement learning algorithms, which happen to be largely REINFORCE (vanilla policy gradient) based, and I notice they don't use batches. This leads me to ask: are batch…
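
For reference, a minimal sketch of one batched REINFORCE variant: collect several complete episodes with the current policy, then take a single gradient step on the mean loss. The episode data structure and names are illustrative.

    import torch

    # `episodes` is a list of episodes, each a list of (log_prob, return) pairs
    # collected with the current policy; all names are illustrative.
    def reinforce_batch_update(optimizer, episodes):
        losses = [-log_prob * ret
                  for episode in episodes
                  for log_prob, ret in episode]
        loss = torch.stack(losses).mean()   # one gradient step per batch of episodes
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
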
2 votes, 1 answer

How to sample the tuples during the initial time steps of the DDPG algorithm?

I am facing an issue in understanding the following line from the pseudocode of the DDPG algorithm: "Sample a random minibatch of $N$ transitions $(s_i, a_i, r_i, s_{i+1})$ from $R$." Here $N$ is a hyperparameter that is equal to the number of…
asked by hanugm (3,571 rep)
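
A common reading of that line is that learning updates only begin once the replay buffer holds at least $N$ transitions; a minimal warm-up sketch under that assumption (names are illustrative).

    import random

    R = []      # replay buffer of (s, a, r, s_next) transitions
    N = 64      # minibatch size hyperparameter from the pseudocode

    def maybe_sample_minibatch():
        if len(R) < N:              # warm-up: too few transitions collected so far,
            return None             # so keep acting (e.g., with exploration noise)
        return random.sample(R, N)  # uniform sampling without replacement
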
2 votes, 0 answers

Methodologies for passing the best samples for a neural network to learn

Just an idea I am sure I read in a book some time ago, but I can't remember the name. Given a very large dataset and a neural network (or anything that can learn via something like stochastic gradient descent, passing a subset of samples to modify…
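
One such methodology is loss-based selection (sometimes called hard-example mining): score a candidate pool with the current model and train on the highest-loss samples. A minimal PyTorch sketch, assuming `loss_fn` returns per-sample losses (reduction='none'); names are illustrative.

    import torch

    # Score a candidate pool with the current model and keep the k highest-loss
    # samples; `loss_fn` must return per-sample losses (reduction='none').
    def select_hardest(model, loss_fn, xs, ys, k):
        with torch.no_grad():
            losses = loss_fn(model(xs), ys)
        idx = losses.topk(k).indices        # indices of the k hardest samples
        return xs[idx], ys[idx]
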
2 votes, 1 answer

Offline/Batch Reinforcement Learning: when to stop training and what agent to select

Context: My team and I are working on an RL problem for a specific application. We have data collected from user interactions (states, actions, rewards, etc.). It is too costly for us to emulate agents. We therefore decided to concentrate on Offline…
1 vote, 1 answer

Batching together similar length sequences to avoid padding and packing

I am training an RNN in PyTorch to produce captions for images. It's a pretty standard architecture: the image is processed by a pre-trained InceptionV3 to extract features, the recurrent module processes the words seen so far, and then its result…
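
A minimal sketch of the bucketing idea in plain Python: group sequences by length so every batch can be stacked without padding or packing. `sequences` is assumed to be a list of token lists; names are illustrative.

    from collections import defaultdict

    # Group sequences by length so each batch stacks without padding or packing.
    def bucket_by_length(sequences, batch_size):
        buckets = defaultdict(list)
        for seq in sequences:
            buckets[len(seq)].append(seq)
        batches = []
        for same_length in buckets.values():
            for i in range(0, len(same_length), batch_size):
                batches.append(same_length[i:i + batch_size])
        return batches
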
1 vote, 1 answer

Why does the output shape of a Dense layer contain a batch size?

I understand that the batch size is the number of examples you pass into the neural network (NN). If the batch size is 10, you feed the NN 10 examples at once. Assume I have an NN with a single Dense layer. This Dense layer of 20 units…
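
A minimal Keras sketch of where the batch dimension shows up: the leading None in the printed shape is the batch size, left unspecified at build time because the layer maps every example in the batch independently. The input size of 5 is just a placeholder.

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(5,)),     # 5 features per example; batch size unspecified
        tf.keras.layers.Dense(20),      # 20 units
    ])
    model.summary()                     # Dense output shape prints as (None, 20)
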
1 vote, 1 answer

What is the difference between batches in deep Q learning and supervised learning?

How is the batch loss calculated in both DQNs and simple classifiers? From what I understand, in a classifier a common method is to sample a mini-batch, calculate the loss for every example, calculate the average loss over the whole batch,…
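
A minimal PyTorch sketch of the DQN side, which follows the same recipe (per-example losses, then the batch mean); standard online/target networks are assumed and names are illustrative.

    import torch
    import torch.nn.functional as F

    # One DQN minibatch update target: r + gamma * max_a' Q_target(s', a').
    def dqn_batch_loss(q_net, target_net, s, a, r, s_next, done, gamma=0.99):
        q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)   # Q(s_i, a_i) per example
        with torch.no_grad():
            target = r + gamma * target_net(s_next).max(1).values * (1 - done)
        return F.mse_loss(q, target)                        # mean over the minibatch
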
0 votes, 0 answers

How is it possible to use batches of data from within the same sequence with an LSTM?

ETA: More concise wording: Why do some implementations use batches of data taken from within the same sequence? Does this not make the cell state useless? Take the example of an LSTM: it has a hidden state and a cell state. These states are updated…
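
A minimal truncated-BPTT sketch in PyTorch of what such implementations typically do: consecutive chunks come from the same sequence, and the detached hidden/cell state is carried across chunk boundaries, so the cell state is not discarded. Sizes are placeholders.

    import torch

    lstm = torch.nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
    long_sequence = torch.randn(1, 1000, 8)        # one long sequence
    state = None                                   # (hidden, cell), zeros at first
    for chunk in long_sequence.split(100, dim=1):  # consecutive 100-step chunks
        out, state = lstm(chunk, state)
        state = tuple(s.detach() for s in state)   # keep the values, cut the graph
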
0 votes, 1 answer

Having the negative cases in the same batch vs. shuffling the dataset

I am working on a model for an NLP task. The model encodes the text and has a regression output layer. In this task, from each (positive) instance I create several negative cases using a specific technique, and I merge them with their positive…
asked by Minions (123 rep)
0 votes, 1 answer

Is it okay to calculate the validation loss over batches instead of the whole validation set for speed purposes?

I have about 2000 items in my validation set. Would it be reasonable to calculate the loss/error after each epoch on just a subset instead of the whole set, if evaluating the whole set is very slow? Would taking random mini-batches to calculate…
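
A minimal sketch of that approach: validate on a random subset each epoch, giving a cheap but noisy estimate of the full-set loss. `val_pairs` and the subset size are illustrative.

    import random
    import torch

    # `val_pairs` is a list of (input, target) pairs; names are illustrative.
    def quick_val_loss(model, loss_fn, val_pairs, subset_size=256):
        model.eval()
        subset = random.sample(val_pairs, subset_size)
        with torch.no_grad():
            losses = [loss_fn(model(x), y) for x, y in subset]
        return torch.stack(losses).mean().item()
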