Questions tagged [sigmoid]

For questions about sigmoid functions (in particular, the logistic function) and the consequences of using them as activation functions in neural networks.

34 questions
10
votes
3 answers

Are ReLUs incapable of solving certain problems?

Background: I've been interested in and reading about neural networks for several years, but I hadn't gotten around to testing them out until recently. Both for fun and to increase my understanding, I tried to write a class library from scratch in…
9
votes
1 answer

What happens when I mix activation functions?

There are several activation functions, such as ReLU, sigmoid or $\tanh$. What happens when I mix activation functions? I recently found that Google has developed the Swish activation function, which is $x \cdot \text{sigmoid}(x)$. By altering the activation function, can it…
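For reference, a minimal NumPy sketch of Swish as described in the question (the function names here are my own, not from the asker's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    # Swish: x * sigmoid(x); smooth, non-monotonic, and ReLU-like for large x.
    return x * sigmoid(x)

print(swish(np.array([-2.0, 0.0, 2.0])))  # approx. [-0.238, 0.0, 1.762]
```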
7
votes
1 answer

How is division by zero avoided when implementing back-propagation for a neural network with sigmoid at the output neuron?

I am building a neural network for which I am using the sigmoid function as the activation function for the single output neuron at the end. Since the sigmoid function is known to take any number and return a value between 0 and 1, this is causing…
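The usual fix (a sketch, not necessarily what the asker's code does) is to clip the sigmoid output away from exactly 0 and 1 before taking logs in the cross-entropy loss:

```python
import numpy as np

def bce_loss(y_true, y_pred, eps=1e-12):
    # Clipping keeps log(y_pred) and log(1 - y_pred) finite even when
    # the sigmoid output saturates to exactly 0.0 or 1.0 in floating point.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(bce_loss(np.array([1.0, 0.0]), np.array([1.0, 0.0])))  # ~0, no -inf or NaN
```

Combining the sigmoid and the cross-entropy analytically also helps: the gradient with respect to the pre-activation simplifies to $\hat{y} - y$, with no division at all.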
7
votes
4 answers

What does "e" do in the Sigmoid Activation Function?

Within the sigmoid "squishification" function, $f(x) = \frac{1}{1 + e^{-x}}$, $e$ is unnecessary, as it can be replaced by any other value that is not 0 or 1. Why is $e$ used here? As shown below, the function works well without it, and in replacement,…
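Indeed, changing the base only rescales the input, since $1/(1 + a^{-x}) = 1/(1 + e^{-x \ln a})$; a quick check:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def base_a_sigmoid(x, a):
    # 1 / (1 + a^(-x)) equals sigmoid(x * ln(a)) for any base a > 1.
    return 1.0 / (1.0 + a ** (-x))

x = np.linspace(-3.0, 3.0, 7)
print(np.allclose(base_a_sigmoid(x, 2.0), sigmoid(x * np.log(2.0))))  # True
```

One reason $e$ is the conventional choice: the derivative takes the tidy form $\sigma'(x) = \sigma(x)(1 - \sigma(x))$, which is convenient for backpropagation.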
4
votes
1 answer

Why is it a problem if the outputs of an activation function are not zero-centered?

In this lecture, the professor says that one problem with the sigmoid function is that its outputs aren't zero-centered. The explanation provided by the professor for why this is bad is that the gradient of our loss w.r.t. the weights…
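The usual reasoning step behind this: for a neuron with pre-activation $z = \sum_i w_i x_i + b$, the chain rule gives

$$\frac{\partial L}{\partial w_i} = \frac{\partial L}{\partial z}\, x_i,$$

so if every input $x_i$ is positive (as sigmoid outputs from the previous layer are), all weight gradients share the sign of $\partial L / \partial z$: a single gradient step can only increase all of the weights or decrease all of them, producing zig-zag updates.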
4
votes
0 answers

Why does sigmoid saturation prevent signal flow through the neuron?

As per these slides on page 35: Sigmoids saturate and kill gradients. When the neuron's activation saturates at either tail of 0 or 1, the gradient at these regions is almost zero. This kills the gradient, and almost no signal will flow through the neuron…
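A small numerical illustration of that "almost zero" gradient, using $\sigma'(x) = \sigma(x)(1 - \sigma(x))$:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the logistic function: sigma(x) * (1 - sigma(x)),
    # maximal (0.25) at x = 0 and vanishing at both tails.
    s = sigmoid(x)
    return s * (1.0 - s)

for x in [0.0, 5.0, 10.0]:
    print(x, sigmoid_grad(x))  # 0.25, ~6.6e-3, ~4.5e-5
```

Since backpropagation multiplies this local gradient into the upstream gradient, a saturated unit passes almost nothing backward.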
4
votes
1 answer

Neural network doesn't seem to converge with ReLU but it does with Sigmoid?

I'm not really sure if this is the sort of question to ask here, since it is less a general question about AI and more about the coding of it; however, I thought it wouldn't fit on Stack Overflow. I have been programming a multilayer perceptron…
4
votes
1 answer

Can neural networks with a sigmoid as the activation function of the output layer approximate continuous functions?

Neural networks are commonly used for classification tasks; in fact, from this post it seems like that's where they shine brightest. However, when we want to classify using neural networks, we often have the output layer take values in $[0,1]$;…
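A common workaround when the target is a bounded continuous function (a sketch with hypothetical helper names): rescale targets into $[0,1]$, fit with a sigmoid output, and invert the scaling afterwards:

```python
import numpy as np

# Hypothetical helpers: map a bounded regression target into [0, 1]
# for a sigmoid output unit, then map predictions back afterwards.
def to_unit(y, lo, hi):
    return (y - lo) / (hi - lo)

def from_unit(s, lo, hi):
    return lo + s * (hi - lo)

y = np.array([-3.0, 0.0, 4.0])
lo, hi = y.min(), y.max()
assert np.allclose(from_unit(to_unit(y, lo, hi), lo, hi), y)
```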
3
votes
1 answer

Accuracy dropped when I ran the program the second time

I was following a tutorial about feed-forward networks and wrote this code for a simple FFN:

    class FirstFFNetwork:
        # initialize the parameters
        def __init__(self):
            self.w1 = np.random.randn()
            self.w2 = np.random.randn()
            self.w3 = …
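Since the weights come from `np.random.randn()` with no fixed seed, each run starts from a different initialization, which plausibly explains the accuracy change between runs; a minimal sketch of the fix, reusing the question's `FirstFFNetwork` class:

```python
import numpy as np

np.random.seed(42)      # any fixed value; makes randn() repeatable
net = FirstFFNetwork()  # same initial w1, w2, ... on every run
```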
2
votes
3 answers

Why is there tanh(x)*sigmoid(x) in an LSTM cell?

CONTEXT: I was wondering why there are sigmoid and tanh activation functions in an LSTM cell. My intuition was based on the flow of $\tanh(x) \cdot \text{sigmoid}(x)$ and the derivative of $\tanh(x) \cdot \text{sigmoid}(x)$. It seems to me that the authors wanted to choose such a…
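For reference, the standard LSTM cell equations, in which the $\tanh \cdot \text{sigmoid}$ product appears in the output path $h_t = o_t \odot \tanh(c_t)$:

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), &
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o), &
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, &
h_t &= o_t \odot \tanh(c_t).
\end{aligned}
$$

The sigmoids act as gates in $[0,1]$ (how much to keep or emit), while $\tanh$ supplies zero-centered content in $[-1,1]$.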
2
votes
1 answer

How do sigmoid functions make it so that the prediction $\hat{y}$ indicates the probability that the observed value, $y$, is $1$?

I am currently studying the textbook Neural Networks and Deep Learning by Charu C. Aggarwal. Chapter 1.2.1.3 Choice of Activation and Loss Functions says the following: The choice of activation function is a critical part of neural network design.…
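The standard reasoning step: treat $\hat{y} = \sigma(w \cdot x)$ as the parameter of a Bernoulli distribution over the label, so that

$$P(y \mid x) = \hat{y}^{\,y}(1 - \hat{y})^{\,1 - y}, \qquad -\log P(y \mid x) = -\bigl[\,y \log \hat{y} + (1 - y) \log(1 - \hat{y})\,\bigr].$$

Training with this log loss is then exactly maximum-likelihood estimation, which is what makes the probability interpretation consistent.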
2
votes
0 answers

In the Binary Flower Pollination Algorithm (using the sigmoid function), is it possible that no feature is selected?

I'm trying to use the Binary Flower Pollination Algorithm (BFPA) for feature selection. In the BFPA, the sigmoid function is used to compute a binary vector that represents whether a feature is selected or not. Here are the relevant equations from…
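A sketch of the sigmoid-based binarization step as commonly written for binary metaheuristics (I'm assuming here that the BFPA follows this standard transfer-function scheme):

```python
import numpy as np

rng = np.random.default_rng(0)

def binarize(x):
    # Feature j is selected iff sigmoid(x_j) exceeds a uniform random draw.
    s = 1.0 / (1.0 + np.exp(-x))
    return (s > rng.random(x.shape)).astype(int)

x = rng.normal(size=5)
print(binarize(x))
```

Since $\sigma(x_j) < 1$ for any finite $x_j$, every bit can come out 0, so an all-zeros selection does have positive probability; many implementations add an explicit guard (e.g., forcing at least one feature).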
2
votes
1 answer

How can I train a neural network for another input set, without losing the learning of the previous input set?

I read this tutorial about backpropagation. Using backpropagation, we train the neural network repeatedly on one input set, say [2,4], until we reach 100% accuracy at getting 1 as the output. And the neural network adjusts its weight…
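One standard remedy (a sketch; `train_step` is a hypothetical single backpropagation update, not from the tutorial): interleave examples from both input sets rather than training one set to 100% accuracy first:

```python
import random

def train_step(x, y):
    # Placeholder for one forward + backward pass on a single example.
    pass

data = [([2, 4], 1), ([3, 1], 0)]  # old and new (inputs, target) pairs mixed
for epoch in range(1000):
    random.shuffle(data)           # interleave so neither set dominates
    for x, y in data:
        train_step(x, y)
```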
2
votes
1 answer

Why will the sigmoid function be 1 (or 0) if we use a fully connected layer that produces a big enough positive (or negative, respectively) output?

I am using a fully connected neural network that uses a sigmoid activation function. If we feed it a big enough input, the sigmoid function will eventually saturate to 1 or 0. Is there any solution to avoid this? Will this lead to the classical sigmoid problems…
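A quick demonstration of the saturation in float64 (a sketch, not the asker's network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(10.0))   # ~0.9999546
print(sigmoid(40.0))   # exactly 1.0: exp(-40) ~ 4e-18 is below float64 epsilon
print(sigmoid(-40.0))  # ~4e-18, indistinguishable from 0 in most downstream ops
```

Common mitigations include normalizing the layer's inputs, penalizing large weights, or computing the loss directly from the pre-sigmoid logits.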
1
vote
2 answers

Why do non-linear activation functions not require a specific non-linear relation between their inputs and outputs?

A linear activation function (or none at all) should only be used when the relation between input and output is linear. Why doesn't the same rule apply to other activation functions? For example, why doesn't sigmoid only work when the relation…
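One way to see why the rule doesn't transfer: near $x = 0$ the sigmoid is almost linear ($\sigma(x) \approx 0.5 + x/4$), so with small weights a sigmoid unit can emulate a linear relation, while larger weights bend it; a quick check:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-0.5, 0.5, 5)
hidden = sigmoid(0.1 * x)        # small weight keeps the unit near its linear regime
approx = 4.0 * (hidden - 0.5)    # undo the local slope and offset: ~0.1 * x
print(np.max(np.abs(approx - 0.1 * x)))  # tiny approximation error
```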