
I am working on image segmentation of MRI thigh images with deep learning (a U-Net). I noticed that I get a higher average Dice accuracy over my predicted masks if I have fewer samples in the test data set. I am calculating it in TensorFlow as:

from tensorflow.keras import backend as K

def dice_coefficient(y_true, y_pred, smooth=0.00001):
    # Flatten both masks so the overlap can be computed elementwise
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    # The smooth term avoids division by zero when both masks are empty
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
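For context, a metric like this would typically be attached at compile time so Keras reports it on both the training and validation data (a minimal sketch; the model and loss here are illustrative, not from the original post):

# Hypothetical usage; "model" stands in for the U-Net instance
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=[dice_coefficient])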

The difference is about 0.003 when the test set has 4x more samples.

I am calculating the Dice coefficient over each 2D MRI slice and then averaging over all slices in the test set.
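In code, the per-slice averaging looks roughly like this (a minimal NumPy sketch, assuming binary masks stacked as (n_slices, H, W) arrays; true_masks, pred_masks and dice_np are illustrative names):

import numpy as np

def dice_np(y_true, y_pred, smooth=0.00001):
    # NumPy version of the metric above, applied to one 2D slice
    intersection = np.sum(y_true * y_pred)
    return (2. * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

# Average the per-slice Dice over the whole test set
per_slice = [dice_np(t, p) for t, p in zip(true_masks, pred_masks)]
mean_dice = np.mean(per_slice)

Note that with this formula a slice where both masks are empty scores exactly 1.0 (smooth/smooth), so the mean depends on which slices a given subset happens to contain.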

Why could this be?

The figure below shows how the accuracy decreases with the fraction of test samples; I start with 0.1 of the data and go up to the whole data set. The splitting of the data was random.

[Figure: average Dice accuracy vs. fraction of the test data]

– Lis Louise
  • How did you create the subset of your test data (to generate the "fewer samples" group)? Did you sample a different group every time and re-test the accuracy each time? Exactly how much did the accuracy improve, and what is the variance of this accuracy as the subset of test data varies? – user3667125 Dec 17 '20 at 01:58
  • Well, I noticed that the validation accuracy was always higher by a small gap (around 0.003 in the Dice ratio). I removed the dropout and it was still the same. So I did a cross-validation with randomly divided sets, and it was the same. I also manually divided a test dataset into 10% and 90% and saw that the accuracy was slightly higher for the smaller portion. – Lis Louise Dec 17 '20 at 15:48
  • Not sure if I read it right, but if the difference in validation accuracy is only 0.003, then it's very small and likely statistically insignificant (probably due to some sort of variance somewhere). If you'd still like to dive deeper into this, I think it would be interesting to graph accuracy vs. data size and test many different data sizes. That way you can see whether there is a clear curve/pattern, which would mean there's something meaningful, or just a flat line that fluctuates up and down, which would mean it's just randomness/variance (a sketch of such an experiment follows these comments). – user3667125 Dec 18 '20 at 22:39
  • Well, this gap is always there between the validation and training set, independently of the random splitting. I just uploaded a graph of accuracy vs. data size. – Lis Louise Dec 21 '20 at 18:19
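A minimal sketch of the resampling experiment suggested in the comments (per_slice_dice is an assumed 1-D NumPy array holding one Dice score per test slice):

import numpy as np

rng = np.random.default_rng(0)
fractions = np.arange(0.1, 1.01, 0.1)

for frac in fractions:
    n = int(frac * len(per_slice_dice))
    # Draw many random subsets of this size to estimate the spread of the mean
    means = [rng.choice(per_slice_dice, size=n, replace=False).mean()
             for _ in range(200)]
    print(f"fraction {frac:.1f}: mean Dice {np.mean(means):.4f} +/- {np.std(means):.4f}")

If the curve of mean versus fraction stays flat within the printed spread, a 0.003 gap is plausibly just sampling variance.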

0 Answers