
I am practicing fine-tuning ResNet50 for a binary classification task. Here is my code snippet:

from keras.applications.resnet50 import ResNet50
from keras.layers import Dropout
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import SGD
import keras

# input_size, batch_size and epochs are defined elsewhere in my script
base_model = ResNet50(weights='imagenet', include_top=False)
x = base_model.output
x = keras.layers.GlobalAveragePooling2D(name='avg_pool')(x)
x = Dropout(0.8)(x)
model_prediction = keras.layers.Dense(1, activation='sigmoid', name='predictions')(x)
model = keras.models.Model(inputs=base_model.input, outputs=model_prediction)
opt = SGD(lr=0.01, momentum=0.9, nesterov=False)

model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
   
train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=False)
  
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
        './project_01/train',
        target_size=(input_size, input_size),  
        batch_size=batch_size,
        class_mode='binary')    

validation_generator = test_datagen.flow_from_directory(
        './project_01/val',
        target_size=(input_size, input_size),
        batch_size=batch_size,
        class_mode='binary')

hist = model.fit_generator(
        train_generator,
        steps_per_epoch=1523 // batch_size,   # 759 + 764 (class NON) = 1523 training images
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=269 // batch_size)   # 134 + 135 (class NON) = 269 validation images

I plotted the training and validation accuracy and loss after training the model for 50 epochs:

As you can see, train_acc and val_acc fluctuate heavily, and train_acc barely reaches 52%, which suggests that the network isn't learning anything, let alone over-fitting the data.

As for the losses, I can't draw any insight from them.

Before training starts, the network outputs:

Found 1523 images belonging to 2 classes.
Found 269 images belonging to 2 classes.

Is my fine-tuned model learning anything at all?

I'd appreciate it if someone could guide me toward solving this issue.

  • Are you basing this code on some existing script? How did you decide on a dropout rate of `0.8` and a learning rate of `0.01`? (The dropout rate is very high, and the learning rate is probably too high; see the sketch after these comments.) What is your batch size? – Mathias Müller Jan 29 '20 at 10:06
  • @MathiasMüller those values come from other models I've been testing. I actually tried several hyper-parameter settings. For now, a VGG16 model is performing quite well. – bit_scientist Mar 26 '20 at 11:28
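
A minimal sketch of the changes suggested in the comment above; the specific values (dropout 0.3, learning rate 1e-4) are illustrative assumptions, not tested settings:

x = Dropout(0.3)(x)                               # was Dropout(0.8)
opt = SGD(lr=1e-4, momentum=0.9, nesterov=False)  # was lr=0.01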

1 Answer


It's difficult to say without knowing what your data looks like, but judging from the numbers, the dataset seems too small, and the images might be either too similar to one another or very different. In any case, I would check with other networks such as Inception, and decrease the learning rate even further (say, to 0.0001) so as not to disturb the ImageNet weights, assuming your data is not very different from the ImageNet classes.
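
A minimal sketch of the advice above, assuming a head like the one in the question (dropout omitted for brevity); the InceptionV3 swap and the exact learning rate are illustrative, not the answerer's code:

from keras.applications.inception_v3 import InceptionV3
from keras.optimizers import SGD
import keras

# Same kind of classification head as in the question, but on
# InceptionV3 and with a much smaller learning rate, as suggested above.
base_model = InceptionV3(weights='imagenet', include_top=False)
x = keras.layers.GlobalAveragePooling2D(name='avg_pool')(base_model.output)
predictions = keras.layers.Dense(1, activation='sigmoid', name='predictions')(x)
model = keras.models.Model(inputs=base_model.input, outputs=predictions)

model.compile(loss='binary_crossentropy',
              optimizer=SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

Note that InceptionV3's canonical input size is 299x299, so target_size in the generators would need to change accordingly.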

  • Many thanks. My images are similar to one another, just like these [signal images](https://stackoverflow.com/questions/59476427/removing-background-noise-from-signal-images-rgb). Can you offer any further suggestions? – bit_scientist Dec 28 '19 at 10:48
  • In this case the number of images is highly inadequate, in my opinion. I'd first try a small custom network with very few layers (see the sketch after these comments) and check whether there's any improvement. Huge architectures like ResNet generally don't work well with small datasets. – deadcode Dec 30 '19 at 11:53
  • How would you deal with weight initialization when there is high similarity between the two classes and the number of images is around 1k per class? – bit_scientist Mar 26 '20 at 11:28
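
As a concrete illustration of the small custom network suggested in the comments, here is a minimal sketch; the layer counts and widths are arbitrary assumptions rather than the answerer's recommendation, and input_size is the same variable as in the question:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# A deliberately small CNN for a dataset of roughly 1.5k images;
# all layer sizes here are illustrative, not tuned.
small_model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(input_size, input_size, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.3),
    Dense(1, activation='sigmoid'),
])
small_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])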