
I'm currently in the middle of a project (for my thesis) constructing a deep neural network. Since I'm still in the research phase, I'm trying to find various techniques for initializing weights. Each approach will then be evaluated, and we will choose the one that best fits our data set and desired outcome.

I'm familiar with Xavier initialization, classic random initialization, He initialization, and zeros. Searching through papers, I came across SCAWI (Statistically Controlled Activation Weight Initialization). If you have used this approach, how effective is it?
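For context, the Xavier and He schemes mentioned above can be sketched in a few lines of NumPy (a minimal illustration written from the formulas in the original papers, not taken from any particular library; the layer sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_uniform(fan_in, fan_out):
    # Glorot & Bengio (2010): uniform in [-limit, limit],
    # limit = sqrt(6 / (fan_in + fan_out)), keeps activation variance
    # roughly constant across layers for tanh-like activations.
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_normal(fan_in, fan_out):
    # He et al. (2015): normal with std = sqrt(2 / fan_in),
    # the factor 2 compensating for ReLU zeroing half the inputs.
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

# Hypothetical layer of 256 inputs and 128 outputs:
W1 = xavier_uniform(256, 128)
W2 = he_normal(256, 128)
```

Both scale the weight magnitude by the layer's fan-in (and, for Xavier, fan-out), which is also the spirit of SCAWI: control the statistics of the pre-activations so that neurons neither saturate nor start "paralyzed".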

(Also, do you know any good sources to find more of these?)

ChrisP
  • My 2 cents: this is a method from back in 1992, hence I would not expect it to have anything meaningful to add in today's context (heck, ReLU did not even exist back in 1992!). – desertnaut Oct 19 '20 at 13:03
  • @desertnaut Something like the ReLU (i.e. setting to zero everything that is negative) was actually used back in the 80s. See e.g. equation 4 of the [neocognitron paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6313076) by Fukushima et al. Although that is not the ReLU, it's similar. Actually, figure 4 of that paper shows the definition of the ReLU. So, the ReLU has existed since at least 1982 (and was used in the context of NNs), but the term "ReLU" or "rectifier" may have been introduced later, and it became standard as an activation function in neural networks only in recent years. – nbro Oct 22 '20 at 09:40
  • @nbro Thank you for this information. But the spirit of my comment was that rectifier activations were far from standard back then; I am old enough to have used NNs for my BSc thesis in 1991-92, and I cannot recall them being part of the standard toolboxes or introductory expositions. On the other hand, the [He initialization](https://arxiv.org/abs/1502.01852) is built exactly upon the premise of such rectifier activations. (1/2) – desertnaut Oct 22 '20 at 10:59
  • @nbro All in all, although assessing the SCAWI scheme in today's context could be a topic of research, worrying about it from a practitioner's viewpoint (such as OP's) sounds like overkill. (2/2) – desertnaut Oct 22 '20 at 10:59

0 Answers