Can someone explain why training a CNN model (in my case DenseNet201) on the same data, with the same data processing pipeline, can be slower on a better GPU (RTX 3090) than on a worse one (RTX 3060), with all other parameters unchanged (exactly the same PC, just with the new GPU)?
In both cases I used the same batch size and the same other settings. The only way to make training faster on the 3090 was to increase the batch size to a value that was too big for the 3060. But I still don't understand why the same training parameters wouldn't give a shorter epoch time on the stronger card.
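For context, here is a minimal sketch of the kind of setup I mean (I'm assuming PyTorch with torchvision's DenseNet201; the dataset is a random-tensor stand-in for my real one, and the loader parameters and class count are illustrative, not my exact code):

```python
import torch
import torchvision
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for my real Dataset, which reads images from disk and applies
# albumentations transforms in __getitem__; random tensors just so this runs.
dummy_dataset = TensorDataset(
    torch.randn(256, 3, 224, 224), torch.randint(0, 10, (256,))
)

train_loader = DataLoader(
    dummy_dataset,
    batch_size=32,      # same value on both the 3060 and the 3090
    shuffle=True,
    num_workers=4,      # CPU-side loading/augmentation workers
    pin_memory=True,
)

device = torch.device("cuda")
model = torchvision.models.densenet201(num_classes=10).to(device)  # class count is illustrative
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```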
Even though a big part of the training time is reading data from disk and data augmentation (albumentations in my case), it's still the same setup on both cards, so even if the GPU work is only a small part of one entire epoch, the epoch should still be at least a bit faster, right?
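To sanity-check that assumption, I suppose I could time the data-loading part and the GPU part of each step separately, roughly like the sketch below (it reuses the loader/model/optimizer from the sketch above and calls torch.cuda.synchronize() before reading the clock so the GPU work has actually finished; again, just an illustrative outline, not my exact code):

```python
import time
import torch

data_time, gpu_time = 0.0, 0.0

end = time.perf_counter()
for images, labels in train_loader:
    # Time spent waiting on the DataLoader (disk reads + albumentations on CPU workers)
    data_time += time.perf_counter() - end

    step_start = time.perf_counter()
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    torch.cuda.synchronize()  # wait for the queued GPU work before timing it
    gpu_time += time.perf_counter() - step_start

    end = time.perf_counter()

print(f"waiting on data: {data_time:.1f}s, GPU compute: {gpu_time:.1f}s")
```

If "waiting on data" dominates on both cards, that would at least explain why the overall epoch time barely moves, but it still wouldn't explain the 3090 being slower rather than equal.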