I have noticed that the most commonly used activation functions are continuous. Is there a specific reason for this? Results such as this paper show that networks with discontinuous activations can be trained, yet the approach does not seem to have taken off. Does anyone have insight into why, or better yet, an article discussing this?
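My own guess (an assumption on my part, not something the paper states) is that gradient-based training is the obstacle: a discontinuous activation such as the Heaviside step is flat almost everywhere, so backpropagation gets no signal through it. A minimal NumPy sketch of that intuition (the function names are mine):

```python
import numpy as np

def sigmoid(z):
    """Continuous activation: small input changes move the output smoothly."""
    return 1.0 / (1.0 + np.exp(-z))

def step(z):
    """Discontinuous activation: 0 for z < 0, 1 for z > 0, flat elsewhere."""
    return np.heaviside(z, 0.5)

z, eps = 0.5, 1e-6
for name, f in [("sigmoid", sigmoid), ("step", step)]:
    # Central finite difference, a stand-in for the derivative backprop needs.
    grad = (f(z + eps) - f(z - eps)) / (2 * eps)
    print(f"{name}: value = {f(z):.4f}, numerical gradient = {grad:.4f}")

# sigmoid has a nonzero gradient (~0.2350), so weight updates can follow it;
# the step function's gradient is 0.0 almost everywhere, giving SGD nothing.
```

If that intuition is right, training such networks needs something other than plain gradient descent, but I'd like to know whether that is actually the reason.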
- [Here](https://ai.stackexchange.com/q/17609/2444) and [here](https://ai.stackexchange.com/q/19894/2444) are two related questions. – nbro Jan 28 '21 at 00:25
- I had seen both before posting, but neither discusses why discontinuous activations are not widely used or why most activations are continuous. – ABIM Jan 28 '21 at 13:58
- Hi @BIM, check [this](https://stats.stackexchange.com/questions/271701/why-is-step-function-not-used-in-activation-functions-in-machine-learning) out; it has some interesting thoughts, though it's specifically about the step function. – mark mark Jan 28 '21 at 15:27