
This is an idea I'm sure I read in a book some time ago, but I can't remember its name.

Given a very large dataset and a neural network (or any model that learns via something like stochastic gradient descent, i.e., updating on a subset of samples at a time rather than on the whole dataset at once), one can train a model for, say, classification.

The idea was a methodology for selecting the samples the model would learn the most from, so you can spare the network from training on examples that would only produce small updates, reducing computing time.

I guess a simple methodology would first pick samples that are similar in features to previously seen ones but carry a different label, and leave samples that are similar in both features and label for last. Does that make sense?
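
To make that concrete, here is a rough NumPy sketch of the heuristic I have in mind: rank samples so that ones whose nearest neighbour in feature space carries a *different* label come first. All the names here are my own, just for illustration, and brute-force pairwise distances are only meant for a small pool:

```python
import numpy as np

def rank_by_label_clash(features, labels):
    """Order samples so that those whose nearest neighbour in feature
    space has a DIFFERENT label come first (presumably the most
    informative), and those whose neighbourhood agrees come last."""
    # Pairwise squared Euclidean distances (O(n^2), fine for a small pool)
    d = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d, np.inf)          # a sample is not its own neighbour
    nn = d.argmin(axis=1)                # index of each sample's nearest neighbour
    disagrees = labels[nn] != labels     # True -> close to a differently-labelled sample
    nn_dist = d[np.arange(len(d)), nn]
    # Primary key: disagreeing samples first; secondary key: closest clash first
    return np.lexsort((nn_dist, ~disagrees))

# Toy usage: 100 samples with 8 features and binary labels
features = np.random.randn(100, 8)
labels = np.random.randint(0, 2, size=100)
order = rank_by_label_clash(features, labels)  # feed batches in this order
```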

Is there a googleable keyword for what I am talking about?
