I have trained a CNN on a binary classification problem. The original problem has 6 different classes, but I am only interested in one of them, so the task is to decide whether a sample belongs to that class or not. In this case, let's say class 2.

After looking closely at the model's performance on the test dataset, I found that the model often confuses class 2 with class 1. Is it common practice to build a balanced dataset from only the class 1 and class 2 data that I have, and then train the model further on that dataset? Is there any research or are there any papers on this? If not, what other solutions are possible, other than building a new model?

NeuroEng

1 Answer

You can use transfer learning to fine-tune your model: take the weights of the model you have already trained and use them as the initialization for further training on the new dataset.

Yes, it is common practice to balance the dataset when training a machine learning model, since this helps keep the model from becoming biased toward the majority class. A model trained on a balanced dataset often predicts the under-represented class more reliably.

In your case, if you have more data from class 1 than class 2, you may want to downsample the class 1 data so that both classes are represented equally. You can then train your model on this balanced dataset and see if it improves performance on the test set.
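As a rough illustration of the downsampling step, here is a minimal sketch using only the Python standard library. The function name `balanced_binary_subset`, the toy labels, and the choice of integer ids as stand-ins for images are all assumptions made for the example; in practice you would apply the same idea to your own dataset indices.

```python
import random
from collections import Counter


def balanced_binary_subset(samples, labels, positive=2, negative=1, seed=0):
    """Return shuffled (sample, binary_label) pairs with equal class counts.

    Only samples labelled `positive` or `negative` are kept. The larger
    group is randomly downsampled to the size of the smaller one.
    Binary label 1 means "is the positive class", 0 means "is not".
    """
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == positive]
    neg = [s for s, y in zip(samples, labels) if y == negative]
    n = min(len(pos), len(neg))
    # rng.sample draws without replacement, so no sample is duplicated.
    pos = rng.sample(pos, n)
    neg = rng.sample(neg, n)
    subset = [(s, 1) for s in pos] + [(s, 0) for s in neg]
    rng.shuffle(subset)
    return subset


# Toy data (hypothetical): 30 sample ids with labels from six classes 0..5.
# Class 1 has 12 samples, class 2 has only 5.
labels = [1] * 12 + [2] * 5 + [0, 3, 4, 5] * 3 + [0]
samples = list(range(len(labels)))

subset = balanced_binary_subset(samples, labels)
counts = Counter(y for _, y in subset)
print(len(subset), counts[1], counts[0])
```

The resulting pairs can then be fed to whatever training loop you already use for fine-tuning. Since this subset contains only class 1 and class 2, it may be worth re-checking the fine-tuned model on samples from the other four classes as well.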

A great deal of research has been conducted on data balancing and its impact on machine learning models. A few papers on the topic include

Faizy