I'm working with some weird data that is apparently quite hard to fit, and I've noticed a strange phenomenon: the validation MSE can jump from roughly 0.0176 to 1534863.6250 in a single epoch! It usually returns to a very low value after a few epochs. No such fluctuation appears in the training loss.
This instability persists across shuffling, repartitioning, and retraining, even though I have 16,000 samples and a heavily regularized network (dropout + residual connections + batch normalization + gradient clipping).
I realize more data might help, but this behavior is still really surprising. What could be causing it?
P.S. The model is a feedforward network with 10 layers of sizes [32, 64, 128, 256, 512, 256, 128, 64, 32, 1], trained with the Adam optimizer. This question may be related (I also see periodic validation loss), but I don't think that asker experienced the same massive instability.
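For concreteness, here is a minimal PyTorch sketch of the setup as I described it. The input width, dropout rate, learning rate, clipping threshold, and the projection shortcuts for the residual connections are all assumptions, since the layer widths don't match between consecutive layers:

```python
# Hypothetical reconstruction of the setup described above; placement of
# batch norm / dropout and all hyperparameters are assumptions.
import torch
import torch.nn as nn

SIZES = [32, 64, 128, 256, 512, 256, 128, 64, 32, 1]

class Block(nn.Module):
    """One hidden layer: Linear -> BatchNorm -> ReLU -> Dropout, with a
    residual connection (projected when input and output widths differ)."""
    def __init__(self, n_in, n_out, p_drop=0.2):
        super().__init__()
        self.fc = nn.Linear(n_in, n_out)
        self.bn = nn.BatchNorm1d(n_out)
        self.drop = nn.Dropout(p_drop)
        self.shortcut = (nn.Identity() if n_in == n_out
                         else nn.Linear(n_in, n_out, bias=False))

    def forward(self, x):
        return self.shortcut(x) + self.drop(torch.relu(self.bn(self.fc(x))))

class Net(nn.Module):
    def __init__(self, n_features):
        super().__init__()
        widths = [n_features] + SIZES[:-1]      # hidden widths 32..512..32
        self.blocks = nn.Sequential(
            *[Block(i, o) for i, o in zip(widths[:-1], widths[1:])]
        )
        self.head = nn.Linear(SIZES[-2], SIZES[-1])  # final linear -> 1

    def forward(self, x):
        return self.head(self.blocks(x))

model = Net(n_features=20)  # feature count is a placeholder
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# one training step with gradient clipping, as described
x, y = torch.randn(64, 20), torch.randn(64, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```

Note that validation MSE is computed in `model.eval()` mode, so batch norm uses its running statistics rather than per-batch statistics there, while training loss is reported in train mode.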