Problem setting
We have to perform binary classification given a training dataset $D$ in which most items belong to class $A$ and only a few belong to class $B$, so the classes are heavily imbalanced.
Approach
We wanted to use a GAN to produce more samples of class $B$, so that our final classification model has a nearly balanced set to train on.
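To make the intended pipeline concrete, here is a minimal sketch of the idea (the toy 1-D data, the linear generator, and all parameter names are my own illustration, not from any specific paper): a generator $G(z) = az + b$ and a logistic discriminator $D(x) = \sigma(wx + c)$, trained with hand-derived gradients, after which samples from $G$ would be mixed into the training set as synthetic class-$B$ items.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the minority class B: 1-D samples from N(4, 1).
def sample_b(n):
    return rng.normal(4.0, 1.0, n)

# Generator G(z) = a*z + b, discriminator D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0          # generator parameters
w, c = 0.0, 0.0          # discriminator parameters
lr = 0.05

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

for step in range(1000):
    z = rng.normal(0.0, 1.0, 64)
    x_real = sample_b(64)
    x_fake = a * z + b

    # Discriminator update: push D(real B) -> 1, D(fake) -> 0.
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    g_real = d_real - 1.0            # d/d(logit) of -log D(x)
    g_fake = d_fake                  # d/d(logit) of -log(1 - D(x))
    w -= lr * np.mean(g_real * x_real + g_fake * x_fake)
    c -= lr * np.mean(g_real + g_fake)

    # Generator update: push D(G(z)) -> 1.
    x_fake = a * z + b
    d_fake = sigmoid(w * x_fake + c)
    grad_x = (d_fake - 1.0) * w      # d/dx of -log D(x)
    a -= lr * np.mean(grad_x * z)
    b -= lr * np.mean(grad_x)

# Synthetic "class-B" samples to mix into the training set.
synthetic_b = a * rng.normal(0.0, 1.0, 500) + b
print(float(synthetic_b.mean()))
```

Note that the discriminator here only ever sees real class-$B$ samples and fakes; nothing in this setup tells it where class $A$ lives, which is exactly what the problem below is about.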
Problem
Suppose the data from classes $A$ and $B$ are very similar. Since we want the GAN to produce synthetic class-$B$ data, we feed the real $B$ samples we have into the discriminator alongside the generated samples. But because $A$ and $B$ are so close, the generator could produce an item $x$ that would naturally belong to class $A$. The discriminator has never seen class-$A$ items, and the two classes are very close, so it may well accept $x$ as part of the original data it was fed. The generator has then successfully fooled the discriminator into believing that $x$ belongs to the original class-$B$ data, while $x$ actually belongs to class $A$.
If the GAN keeps producing such items, the synthetic data is useless: combined with the original data, it would only add heavy label noise.
At the same time, suppose that before we start training the generator, we show the discriminator both our class-$A$ and class-$B$ samples, signalling (through backpropagation) that the class-$A$ items are not part of class $B$. The discriminator would then learn to reject class-$A$ items fed to it. But wouldn't this mean that the discriminator has just become the classification model we wanted to build in the first place, i.e. one that distinguishes between class $A$ and class $B$?
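The point can be demonstrated with a tiny logistic-regression "discriminator" on toy 2-D Gaussians (an illustrative sketch of my own, not from any paper): real class-$B$ samples are labelled 1, real class-$A$ samples are labelled 0, exactly as the pre-training step above proposes. The trained model is then, by construction, an $A$-vs-$B$ classifier.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 2-D data: class A and class B deliberately close but distinct.
x_a = rng.normal([0.0, 0.0], 0.7, size=(500, 2))   # majority class A
x_b = rng.normal([1.5, 1.5], 0.7, size=(100, 2))   # minority class B

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Logistic "discriminator": B-samples are positives; A-samples are the
# extra negatives shown to it before GAN training, as proposed above.
x = np.vstack([x_b, x_a])
y = np.concatenate([np.ones(len(x_b)), np.zeros(len(x_a))])
w = np.zeros(2)
c = 0.0
lr = 0.1
for _ in range(2000):
    p = sigmoid(x @ w + c)
    grad = p - y                     # gradient of the logistic loss
    w -= lr * (x.T @ grad) / len(x)
    c -= lr * grad.mean()

# The trained discriminator *is* an A-vs-B classifier.
pred = (sigmoid(x @ w + c) > 0.5).astype(float)
accuracy = (pred == y).mean()
print(float(accuracy))
```

In other words, once the discriminator is taught to reject class $A$, the "side product" of GAN training is precisely the classifier the whole pipeline was supposed to produce, which is the circularity the question is pointing at.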
Do you know of a solution to the problem stated above, or can you point me to papers or other posts that address it?