
Is there any work that investigates how to train neural networks such that they yield approximately the same weights regardless of the order in which training samples from the dataset are presented? I am asking because I have seen other work claiming that resetting the weights of later layers improves training and generalisation. This was explained as correcting an initial bias produced by seeing some parts of the data earlier than others.
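To make the question concrete, here is a minimal sketch (a hypothetical toy setup, not from any of the work mentioned above) of the order-dependence I mean: plain per-sample SGD on a small logistic-regression model, started from the same initialisation, visiting the same samples in two different orders, ends up at different weights.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))          # toy dataset
w_true = rng.normal(size=5)
y = (X @ w_true > 0).astype(float)    # binary labels

def train(order, lr=0.5, epochs=1):
    """Per-sample SGD on logistic loss, visiting samples in `order`."""
    w = np.zeros(5)                   # identical initialisation for every run
    for _ in range(epochs):
        for i in order:
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= lr * (p - y[i]) * X[i]   # gradient of the log-loss at sample i
    return w

w_a = train(np.arange(64))            # natural presentation order
w_b = train(rng.permutation(64))      # shuffled presentation order
print(np.linalg.norm(w_a - w_b))      # nonzero: the order changed the final weights
```

A training procedure of the kind I am asking about would make that norm (approximately) zero for all permutations; full-batch gradient descent trivially has this property, but I am interested in whether it can be achieved for practical stochastic training of deep networks.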
