I'm coming from the 3Blue1Brown YouTube videos, which showed that the weights learned by individual layers do not form discernible shapes in the case of handwritten letter recognition. I wondered whether you could penalize dispersed shapes during training, so that the network learns connected shapes instead (at least in the first layer, to begin with). That way, it might be easier to understand how the input propagates through the layers.
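To make the idea concrete, here is a minimal sketch of one possible penalty. It assumes that "dispersed" can be approximated by total variation (the sum of absolute differences between neighbouring weights), which is a stand-in I chose and not something from the videos. The function name, the `(n_hidden, height * width)` weight layout, and the 4x4 toy grid are all hypothetical.

```python
import numpy as np

def total_variation_penalty(W, height, width):
    """Sum of absolute differences between neighbouring weights.

    W is assumed to have shape (n_hidden, height * width): one row per
    first-layer neuron, each row viewed as a height x width image.
    """
    maps = W.reshape(-1, height, width)
    # Differences between vertically adjacent pixels
    dh = np.abs(maps[:, 1:, :] - maps[:, :-1, :]).sum()
    # Differences between horizontally adjacent pixels
    dw = np.abs(maps[:, :, 1:] - maps[:, :, :-1]).sum()
    return dh + dw

# Toy comparison: two weight maps with the same four active pixels,
# once as a connected 2x2 blob and once scattered over the grid.
blob = np.zeros((4, 4))
blob[1:3, 1:3] = 1.0

scattered = np.zeros((4, 4))
for r, c in [(0, 1), (1, 3), (2, 0), (3, 2)]:
    scattered[r, c] = 1.0

tv_blob = total_variation_penalty(blob.reshape(1, -1), 4, 4)
tv_scattered = total_variation_penalty(scattered.reshape(1, -1), 4, 4)
print(tv_blob, tv_scattered)  # the connected blob scores lower
```

During training, such a term would be added to the usual loss with some weighting factor, for example `loss = data_loss + lam * total_variation_penalty(W1, 28, 28)`, where `lam`, `W1`, and the 28x28 input size are again assumptions. Whether this actually produces interpretable first-layer shapes is exactly the open question.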
Thanks, Jonny