
When training a neural network, you often see a curve showing how quickly the network is improving. It usually rises very fast, then slows down until it is almost horizontal.

Is there a mathematical formula that matches these curves?


Some similar curves are:

$$y=1-e^{-x}$$

$$y=\frac{x}{1+x}$$

$$y=\tanh(x)$$

$$y=1+x-\sqrt{1+x^2}$$
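A quick numerical check, as a minimal sketch, suggests what these four candidates have in common and where they differ: each starts at 0 with slope 1 and saturates at 1, but the gap to the asymptote shrinks exponentially for the first and third and only roughly like 1/x for the second and fourth. The dictionary, the helper name `gap_to_asymptote`, and the sample points are illustrative choices, not part of the original question.

```python
import math

# The four candidate curves listed above (labels are illustrative).
candidates = {
    "1 - exp(-x)": lambda x: 1.0 - math.exp(-x),
    "x / (1 + x)": lambda x: x / (1.0 + x),
    "tanh(x)": math.tanh,
    "1 + x - sqrt(1 + x^2)": lambda x: 1.0 + x - math.sqrt(1.0 + x * x),
}

def gap_to_asymptote(f, x):
    """Distance between f(x) and the shared asymptote y = 1."""
    return 1.0 - f(x)

for name, f in candidates.items():
    # Forward-difference estimate of the slope at the origin.
    slope_at_zero = (f(1e-6) - f(0.0)) / 1e-6
    gaps = ", ".join(f"{gap_to_asymptote(f, x):.2e}" for x in (1.0, 5.0, 10.0))
    print(f"{name:24s} f(0)={f(0.0):.3f}  slope(0)~{slope_at_zero:.3f}  gaps: {gaps}")
```

So the candidates are indistinguishable near the start and differ mainly in how slowly the tail approaches the plateau, which is the part of a learning curve that would tell them apart.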

Is there a theoretical reason for this shape?

zooby
  • As you mention, "It **usually** grows very fast then slows down", so, of course, there isn't a single mathematical formula that would describe all learning curves. So, I suggest you clarify your question, which seems to be quite interesting. – nbro Jan 09 '20 at 15:36
  • @nbro OK. By "usually" I mean always. And of course it slows down, because the neural network can't keep gaining intelligence forever. But these curves almost always look the same, and they usually keep growing for a long time, only very slowly. I'm just wondering if one can theoretically model this curve and produce a formula for it. – zooby Jan 09 '20 at 19:08
  • Someone changed it to "how fast the neural network is converging". This is not what the curves show. The curves might show, for example, how the neural network improves at playing chess. – zooby Jan 09 '20 at 19:11
  • Your question assumes the NN is working fine. Using an RNN for a CNN task might lead to a decreasing curve, as might increasing the learning rate or changing other hyperparameters. –  Jan 09 '20 at 19:21
  • I had changed that sentence because you had written something like "network is gaining intelligence", which is somewhat debatable. Maybe "convergence" isn't the most appropriate word; I didn't think much about it. Anyway, learning curves look a bit like logarithmic curves. – nbro Jan 09 '20 at 19:24
  • I don't think there is a universal mathematical formula that describes the learning curve, and not all curves look like your diagram. In Q-learning there is almost no overfitting, so the curve continues to rise, but in other tasks there may be overfitting, so the curve may rise and then slowly fall again. Maybe one could fit a function to the curve? – Clement Jan 11 '20 at 04:59
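
Following the suggestion in the last comment of fitting a function to the curve, here is a minimal sketch of what that could look like. It assumes a two-parameter model y = a(1 - e^{-bx}), which is only one of many possible choices, and it uses synthetic, noise-free data generated from that same model rather than any measured learning curve. The helper name `fit_saturating_exp` and the grid of rates are hypothetical. For a fixed rate b, the least-squares plateau a has a closed form, so only b is searched.

```python
import math

def fit_saturating_exp(xs, ys, b_grid):
    """Least-squares fit of y = a * (1 - exp(-b * x)).

    For each candidate rate b, the best plateau a has the closed form
    a = sum(y * g) / sum(g * g) with g = 1 - exp(-b * x), so only b
    needs a coarse grid search. Returns (a, b, sum_of_squared_errors).
    """
    best = None
    for b in b_grid:
        g = [1.0 - math.exp(-b * x) for x in xs]
        a = sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)
        sse = sum((y - a * gi) ** 2 for y, gi in zip(ys, g))
        if best is None or sse < best[2]:
            best = (a, b, sse)
    return best

# Synthetic "learning curve" (hypothetical data, not a real training run).
xs = [float(i) for i in range(1, 51)]
ys = [0.9 * (1.0 - math.exp(-0.1 * x)) for x in xs]

b_grid = [0.01 * k for k in range(1, 101)]  # candidate rates 0.01 .. 1.00
a_hat, b_hat, sse = fit_saturating_exp(xs, ys, b_grid)
print(f"plateau a ~ {a_hat:.3f}, rate b ~ {b_hat:.3f}, SSE = {sse:.2e}")
```

On real curves one would compare several model families, for example the exponential form against a power-law tail, by their residuals instead of assuming one form up front.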

0 Answers