
I have about 2000 items in my validation set. Would it be reasonable to calculate the loss/error after each epoch on just a subset instead of the whole set, if evaluating on the whole set is very slow?

Would taking random mini-batches to calculate the loss be a good idea, given that the network wouldn't be evaluated on a constant set? Or should I just shrink the size of my validation set?

  • I would also suggest that you explain a little bit the model you're using, how much time it takes to compute the validation loss and what is the task you're trying to solve. – nbro Feb 15 '21 at 12:24

1 Answer


I assume you meant to write "compute the evaluation metric over the validation set in batches"; you do not compute the loss over the validation set!

That is standard practice in many academic implementations (when the validation set is large enough, memory becomes a constraint). However, be sure to take the average of the values over all the batches. Using a K-fold setup will further increase the confidence in the reported values.
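For concreteness, here is a minimal PyTorch-style sketch of batched evaluation, assuming you already have a `model`, a criterion such as `nn.CrossEntropyLoss` (called `metric_fn` below), and a `val_loader` DataLoader; the names are placeholders, not anything from the question. It accumulates the per-batch values and averages them over the whole set, weighting by batch size so a smaller final batch does not skew the result.

```python
import torch

def evaluate(model, metric_fn, val_loader, device="cpu"):
    """Average a per-batch metric (e.g. a loss criterion) over the validation set."""
    model.eval()
    total, n_examples = 0.0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for inputs, targets in val_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            value = metric_fn(outputs, targets)
            # Weight each batch by its size so the last (possibly smaller)
            # batch does not distort the average.
            total += value.item() * inputs.size(0)
            n_examples += inputs.size(0)
    return total / n_examples
```

If you only want a quick per-epoch estimate, you could call this on a random subset of the validation data, but the full weighted average above is what gives the value for the entire set.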

anurag