
I'm training a deep network in Keras on images for binary classification (I have around 12K images). Once in a while, I collect some false positives, add them to my training set, and re-train for higher accuracy.

I split my data 20/80 percent into training/validation sets.

Now, my question is: which resulting model should I use? Always the one with the higher validation accuracy, or perhaps the one with the higher mean of training and validation accuracy? Which of the two below would you prefer?

Epoch #38: training acc: 0.924, validation acc: 0.944
Epoch #90: training acc: 0.952, validation acc: 0.932
– Tina J

2 Answers


Neither of the criteria you mention is a reliable indicator of model performance.

A simple way to train the model just enough that it generalizes well to unseen data is to monitor the validation loss. Training should be stopped once the validation loss starts increasing progressively over multiple epochs; beyond that point, the model learns the statistical noise in the data and starts overfitting.

[Figure: training vs. validation loss curves over epochs; the early-stopping point is where the validation loss begins to rise]

This early-stopping technique can be implemented in Keras with a callback:

import tensorflow as tf

LOSS_THRESHOLD = 0.2        # example value; tune after a trial run (see below)
ACCURACY_THRESHOLD = 0.95   # example value; the key below must match your compiled metric

class EarlyStop(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        # Stop once validation loss is low enough and validation accuracy is high enough.
        if (logs.get('val_loss', float('inf')) < LOSS_THRESHOLD
                and logs.get('val_categorical_accuracy', 0.0) > ACCURACY_THRESHOLD):
            self.model.stop_training = True

callbacks = EarlyStop()
model.fit(..., callbacks=[callbacks])

The loss and accuracy thresholds (LOSS_THRESHOLD and ACCURACY_THRESHOLD above) can be estimated after a trial run of the model by monitoring the validation/training error graph.
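Note that the callback above stops on fixed thresholds rather than when the validation loss turns upward. For the increasing-loss criterion described earlier, Keras's built-in EarlyStopping callback can be used instead; a minimal sketch (the patience value is illustrative):

import tensorflow as tf

# Stop when val_loss has not improved for `patience` consecutive epochs,
# then roll back to the weights from the best epoch seen.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=5,                  # illustrative; tune to your training noise
    restore_best_weights=True,
)
model.fit(..., callbacks=[early_stop])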

– s_bh
  • So I should track (and log) the `val_loss` metric instead of `val_acc` in the name of the output model, and choose the one with the lowest `val_loss`. Is that right? – Tina J Feb 08 '20 at 01:28
  • @TinaJ Yes, that is a better indicator of model performance. Validation accuracy may fluctuate throughout training (a high validation accuracy reached in the initial epochs can be a fluke that says little about the predictive power of the model). A checkpointing sketch follows these comments. – s_bh Feb 08 '20 at 01:52
  • Umm, I re-train my model once in a while with added data. I see `val_acc` increase, but when I test the model expecting better accuracy, it actually behaves worse than before. Is that because the `val_loss` was probably higher than before? – Tina J Feb 08 '20 at 04:21
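A minimal sketch of the checkpointing discussed in these comments, using Keras's built-in ModelCheckpoint callback (the filename pattern is illustrative):

import tensorflow as tf

# Save a checkpoint only when val_loss improves; epoch number and val_loss
# are formatted into the filename so candidate models are easy to compare.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath='model-{epoch:02d}-{val_loss:.4f}.h5',  # illustrative pattern
    monitor='val_loss',
    mode='min',
    save_best_only=True,
)
model.fit(..., callbacks=[checkpoint])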

The training accuracy tells you nothing about how well the model performs on data other than what it learned on; it can score higher there simply because it has memorized those examples.

The validation set, on the other hand, indicates how well the model generalizes what it learned to new data (hopefully the validation set accurately represents the diversity of the data).

Since you are looking for a model that performs well on every dataset, you should not use training accuracy to choose your model, and so you should choose the first one (epoch #38, with the higher validation accuracy); see the comparison sketch below.
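To make that choice concrete, the saved candidates can be scored on the same validation data; a minimal sketch, assuming both checkpoints were saved with an accuracy metric compiled in (the filenames and x_val/y_val are hypothetical):

import tensorflow as tf

# Compare candidate checkpoints on identical validation data and keep
# the one that generalizes best (hypothetical filenames).
for path in ['model-epoch38.h5', 'model-epoch90.h5']:
    candidate = tf.keras.models.load_model(path)
    loss, acc = candidate.evaluate(x_val, y_val, verbose=0)
    print(f'{path}: val_loss={loss:.3f}, val_acc={acc:.3f}')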

– kirua