
[Image: plot of training, validation, and test accuracy over epochs]

Is this due to my dropout layers being disabled during evaluation?

I'm classifying the CIFAR-10 dataset with a CNN using the Keras library.

There are 50000 samples in the training set; I'm using a 20% validation split for my training data (10000:40000). I have 10000 instances in the test set.

Tobi
  • Please provide the size of your datasets, the batch size, the specific architecture (`model.summary()`), the loss function, and which accuracy metric you are using. The validation and test accuracies are only slightly greater than the training accuracy. This can happen (e.g. because the validation or test examples come from a distribution on which the model actually performs better), although that is unusual. How many examples do you use for validation and testing? – nbro Mar 04 '20 at 14:06

1 Answer


It is somewhat rare for the validation and test accuracy to exceed the training accuracy. One thing that could cause this is the selection of the validation and test data. Was the data for these two sets selected randomly, or did you do the selection yourself? It is generally better to have these sets selected randomly from the overall data set; that way, their probability distribution will closely match the distribution of the training set. Normally the training accuracy is higher (especially if you run enough epochs, which I see you did), because there is always some degree of overfitting, which reduces validation and test accuracy.

The only other thing I can think of is the effect of dropout layers. If your model has dropout layers and the dropout rate is high, that could cause this accuracy disparity. The training accuracy is computed with dropout active, which can lower it to some degree. However, when evaluating validation and test accuracy, dropout is NOT active, so the model is actually more accurate. This increase in accuracy might be enough to overcome the decrease due to overfitting. That is especially plausible here, since the accuracy differences appear to be quite small.
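The training-vs-evaluation difference can be sketched in plain NumPy. This is a toy illustration of inverted dropout (the scheme Keras uses), not the asker's actual model; the shapes and rate are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.random((4, 8))  # toy batch of layer activations
rate = 0.5                        # dropout rate: fraction of units dropped

# Training mode: randomly zero units, then rescale survivors by 1/(1-rate)
# so the expected activation magnitude is unchanged.
mask = rng.random(activations.shape) >= rate
train_out = activations * mask / (1.0 - rate)

# Evaluation mode (model.evaluate / model.predict): dropout is inactive,
# so activations pass through unchanged.
eval_out = activations

print(train_out.shape, eval_out.shape)
```

Because the training-mode forward pass works with a randomly thinned network, the accuracy logged during training can sit below what the full network achieves at evaluation time, which is the disparity described above.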

Gerry P