
I am trying to build a classifier that should be trained with the cross entropy loss. The training data is highly class-imbalanced. To tackle this, I followed the advice in the TensorFlow docs

and now I am using a weighted cross entropy loss where the weights are calculated as

weight_for_class_a = (1 / samples_for_class_a) * total_number_of_samples/number_of_classes

following the mentioned tutorial.
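
For concreteness, here is a minimal sketch of that weight computation with made-up class counts (900 samples of class 0, 100 of class 1; both numbers are assumptions for illustration). The resulting dict is the format Keras accepts via model.fit(..., class_weight=class_weight):

```python
import numpy as np

# Hypothetical imbalanced labels: 900 samples of class 0, 100 of class 1.
labels = np.array([0] * 900 + [1] * 100)

total = len(labels)           # total_number_of_samples
counts = np.bincount(labels)  # samples per class
n_classes = len(counts)       # number_of_classes

# weight_for_class_i = (1 / samples_for_class_i) * total / n_classes
class_weight = {i: (1.0 / counts[i]) * total / n_classes
                for i in range(n_classes)}

print(class_weight)  # {0: 0.555..., 1: 5.0}
```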

It works perfectly, but why is there this factor total_number_of_samples/number_of_classes? The tutorial says:

[...] helps keep the loss to a similar magnitude.

But I do not understand why. Can someone clarify?

jmatin

1 Answer


This comes from the fact that you want the loss to keep the same magnitude. Think of it this way: a non-weighted loss function effectively has all its weights set to 1, so over the whole data set every sample is weighted with 1 and the sum of all weights is therefore $N$, where $N$ is the total number of samples.

Now in the case of a weighted loss, we want the weights to also sum to $N$ so that the loss's magnitude is comparable ($i = 1, \dots, C$ are your classes, $N_i$ is the number of samples for class $i$):

$$S = \sum_{i=1}^{C} \sum_{s=1}^{N_i} w_{i} = \sum_{i=1}^{C}\sum_{s=1}^{N_i}\frac{1}{N_i} \frac{N}{C} = \frac{N}{C} \sum_{i=1}^{C}\sum_{s=1}^{N_i}\frac{1}{N_i} = \frac{N}{C} \sum_{i=1}^{C}N_i\frac{1}{N_i} = \frac{N}{C} \sum_{i=1}^{C}1 = \frac{N}{C} \cdot C = N$$
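
As a quick sanity check, here is a short sketch (using the same made-up counts as in the question, $N_1 = 900$ and $N_2 = 100$, which are assumptions) verifying numerically that the per-sample weights sum to $N$:

```python
import numpy as np

# Hypothetical counts: N_1 = 900 samples of class 0, N_2 = 100 of class 1.
labels = np.array([0] * 900 + [1] * 100)
N, C = len(labels), 2

counts = np.bincount(labels)
# Each sample gets the weight of its class: (1 / N_i) * N / C.
per_sample_weights = (1.0 / counts[labels]) * N / C

print(per_sample_weights.sum())  # 1000.0 == N, as derived above
```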

ted