I want to train a WGAN in which the convolution layers of the critic are constrained to have nonnegative weights (for a technical reason). The biases, however, may take either sign, and there is no constraint on the generator's weights.
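For concreteness, here is a minimal sketch of the kind of constraint I have in mind, written in PyTorch; the architecture is just a placeholder, and I'm assuming a simple projection (clamping the conv weights to be nonnegative after each critic update) rather than any particular reparameterization:

```python
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Toy convolutional critic for 28x28 inputs; conv weights are kept nonnegative."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1),    # 28x28 -> 14x14
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 14x14 -> 7x7
            nn.LeakyReLU(0.2),
        )
        self.head = nn.Linear(128 * 7 * 7, 1)

    def forward(self, x):
        h = self.features(x)
        return self.head(h.flatten(1))

    def project_nonnegative(self):
        """Project the conv weights onto the nonnegative orthant; biases stay free."""
        with torch.no_grad():
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    m.weight.clamp_(min=0.0)

# Inside the critic's training step, the projection runs right after the update:
#   critic_loss.backward()
#   opt_critic.step()
#   critic.project_nonnegative()
```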
I ran a toy experiment on MNIST and observed that performance is significantly worse than that of a regular WGAN.
What could be the reason? Can you suggest architectural modifications so that the nonnegativity constraint does not severely impair the model's capacity?