I was reading the AlexNet paper, and the authors note that
the kernels on one GPU were "largely color agnostic," whereas the kernels on the other GPU were largely "color-specific."
The upper GPU operates on the top half of the filters and the lower GPU deals with the lower half. But why does each of them learn a different set of features, i.e. the top half of the kernels learning mostly edges and the bottom half learning mostly color variation? Is there any reason behind it?
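
To make the setup concrete, here is a minimal sketch of how I understand the split, assuming the first layer has 96 kernels of size 11x11x3 with 48 on each GPU. The random weights are placeholders rather than trained values, and the names `gpu1_kernels`, `gpu2_kernels`, and `color_variation` are my own, not from the paper.

```python
import numpy as np

# Placeholder weights (NOT trained values): 96 first-layer kernels, 11x11 spatial, 3 color channels.
rng = np.random.default_rng(0)
kernels = rng.standard_normal((96, 11, 11, 3))

# Assumed split: the first 48 kernels belong to one GPU, the last 48 to the other.
gpu1_kernels, gpu2_kernels = np.split(kernels, 2, axis=0)

def color_variation(k):
    """Rough per-kernel score: std across the RGB axis, averaged over spatial positions.

    A score near 0 means the kernel treats all channels alike (color agnostic);
    a larger score means it responds differently per channel (color specific).
    """
    return k.std(axis=-1).mean(axis=(1, 2))

print(gpu1_kernels.shape, gpu2_kernels.shape)
print(color_variation(gpu1_kernels).mean(), color_variation(gpu2_kernels).mean())
```

With trained weights loaded in place of the random ones, I would expect the two averages to differ noticeably if one half really is color agnostic and the other color specific.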