
I have noticed that in recent years there are many studies on how to train or update neural networks faster with equal or better performance. I have found the following methods (excluding the chip arms race):

  1. few-shot learning, for instance via pre-training.
  2. using a minimum viable dataset, for instance via (guided) progressive sampling.
  3. model compression, for instance efficient transformers.
  4. data echoing, i.e., letting each loaded batch pass through the graph (or GPU) multiple times (see the sketch after this list).
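To make item 4 concrete, here is a minimal sketch of data echoing at the input-pipeline level. It assumes `batches` is any iterable of already-loaded batches; the names `echoed` and `ECHO_FACTOR` are only illustrative:

```python
import itertools

# Minimal sketch of "data echoing": repeat each batch coming out of the
# (slow) input pipeline ECHO_FACTOR times, so the (fast) training step on
# the accelerator is not starved by I/O and augmentation.
ECHO_FACTOR = 2

def echoed(batches, factor=ECHO_FACTOR):
    for batch in batches:                            # expensive: load + augment once
        yield from itertools.repeat(batch, factor)   # cheap: reuse the same batch
```

The trainer then simply iterates over `echoed(train_loader)` and performs one optimizer step per yielded batch; the trade-off is that repeated batches are correlated, so the wall-clock speed-up may come at some cost in per-step progress.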

Is there a systematic structure to this topic, and how can we train or update a model faster without losing its capacity?

Lerner Zhang
  • Use a validation set, maybe? – Alex Jan 16 '21 at 11:27
  • @Alex What do you mean by validation set? I don't quite get your point, sorry. Could you please flesh your comment out into an answer? – Lerner Zhang Jan 16 '21 at 11:33
  • Use validation data to check whether you are overfitting (in addition to other tricks); this will prevent degradation of the model's performance. – Alex Jan 16 '21 at 11:40
  • @Alex Yes, but consider a model that takes N epochs to reach its best performance on the development set: can we speed up training by other means, i.e., make N smaller than for the original model or reduce the wall-clock time, for instance by reducing the amount of training data or compressing the model? – Lerner Zhang Jan 16 '21 at 11:43
  • what do you mean by 'compressing the model size'? – Alex Jan 16 '21 at 11:55
  • @Alex Yes, using techniques like knowledge distillation or pruning. – Lerner Zhang Jan 16 '21 at 12:03
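To clarify what "compressing the model size" refers to in the comments above, here is a minimal sketch of a knowledge-distillation loss, assuming PyTorch; `student_logits`, `teacher_logits`, `temperature`, and `alpha` are placeholder names for illustration:

```python
import torch.nn.functional as F

# Minimal sketch of a knowledge-distillation loss: train a small student
# model to match both the true labels and the softened output distribution
# of a large, already-trained teacher model.
def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence against the teacher's softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard
```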

0 Answers