Questions tagged [tanh]

Use this tag for questions related to the hyperbolic tangent (tanh) activation function used in neural networks.

5 questions
3
votes
1 answer

Why is tanh a "smoothly" differentiable function?

The sigmoid, tanh, and ReLU are popular and useful activation functions in the literature. The following excerpt, taken from p. 4 of Neural Networks and Neural Language Models, says that tanh has a couple of interesting properties. For example, the…
hanugm
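For context on the question above, a minimal sketch (not from the question itself, assuming standard NumPy) of what "smoothly" differentiable means in practice: the derivative of tanh, 1 - tanh(x)^2, is itself continuous everywhere, whereas ReLU's derivative jumps from 0 to 1 at the origin.

import numpy as np

# tanh is smooth: its derivative 1 - tanh(x)^2 is continuous and itself
# differentiable everywhere, unlike ReLU, whose derivative jumps at x = 0.
def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

x = np.linspace(-4, 4, 9)
print(np.round(tanh_grad(x), 4))   # rises smoothly toward 1 at x = 0, then falls
relu_grad = (x > 0).astype(float)  # hard 0/1 step: not defined at the origin
print(relu_grad)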
2
votes
1 answer

Why and when do we use ReLU over the tanh activation function?

I was reading LeCun's Efficient BackProp, and the author repeatedly stressed the importance of centering the input patterns at 0, thereby justifying the use of the tanh sigmoid. But if tanh is good, then how come ReLU is very popular in most NNs (which is even…
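A small illustration of the zero-centering argument referenced in this question (my own sketch, assuming standard-normal pre-activations, not code from the question): tanh keeps activations roughly zero-mean, while ReLU shifts them positive.

import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)   # zero-mean pre-activations

print(np.tanh(z).mean())           # ~0: tanh preserves the zero-centering
print(np.maximum(z, 0).mean())     # ~0.4: ReLU pushes the mean positive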
2
votes
3 answers

Why is there tanh(x)*sigmoid(x) in an LSTM cell?

CONTEXT: I was wondering why there are both sigmoid and tanh activation functions in an LSTM cell. My intuition was based on the flow of tanh(x)*sigmoid(x) and the derivative of tanh(x)*sigmoid(x). It seems to me that the authors wanted to choose such a…
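For context, a minimal sketch of where the tanh-times-sigmoid product appears in a standard LSTM (the numeric values below are made up for illustration; only the structure matters): the hidden state is h_t = o_t * tanh(c_t), where the output gate o_t is sigmoid-activated.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# sigmoid acts as a 0-to-1 gate, tanh as a -1-to-1 squashing of the cell state:
# h_t = o_t * tanh(c_t), with o_t = sigmoid(W_o x_t + U_o h_{t-1} + b_o).
c_t = np.array([-2.0, 0.5, 3.0])          # hypothetical cell state
o_gate_pre = np.array([1.0, -1.0, 4.0])   # hypothetical gate pre-activations
h_t = sigmoid(o_gate_pre) * np.tanh(c_t)
print(h_t)  # each squashed value is scaled by a gate in (0, 1)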
1
vote
0 answers

Could we add clipping in the output layer of the actor in DDPG?

I have a doubt about how clipping affects the training of RL agents. In particular, I have come across code for training DDPG agents; the pseudo-code is the following:
for i in training iterations:
    action = clip(ddpg.prediction(state)…
Leibniz
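Related to this question, a common DDPG pattern (a sketch under my own assumptions, not the asker's code) is to avoid hard clipping in the actor altogether: the last layer applies tanh and rescales, so actions lie in [low, high] by construction and gradients are not zeroed out past a clip boundary.

import torch
import torch.nn as nn

# Sketch of a DDPG actor that squashes with tanh instead of clipping.
# Dimensions and bounds below are hypothetical.
class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, low, high):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim),
        )
        self.register_buffer("low", torch.as_tensor(low, dtype=torch.float32))
        self.register_buffer("high", torch.as_tensor(high, dtype=torch.float32))

    def forward(self, state):
        squashed = torch.tanh(self.net(state))  # in (-1, 1), smooth everywhere
        return self.low + (squashed + 1) * (self.high - self.low) / 2

actor = Actor(state_dim=3, action_dim=1, low=-2.0, high=2.0)
print(actor(torch.randn(5, 3)))  # every action already inside [-2, 2]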
0
votes
0 answers

What is the idea behind gated-attention CNNs?

I have the below code for gated attention:

class Attn_Net_Gated(nn.Module):
    # Attention Network with Sigmoid Gating (3 fc layers). Args:
    # L: input feature dimension
    # D: hidden layer dimension
    # dropout: whether to use dropout (p =…
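The class in the excerpt is cut off; below is a self-contained reconstruction of the usual gated-attention pattern (my sketch, not the asker's exact code, with hypothetical default sizes): a tanh branch produces signed features, a sigmoid branch gates them elementwise, and a final linear layer emits one attention score per item.

import torch
import torch.nn as nn

# Reconstruction of the common gated-attention module: score = W_c(tanh * sigmoid).
class AttnNetGated(nn.Module):
    def __init__(self, L=1024, D=256, dropout=0.25):
        super().__init__()
        self.attention_a = nn.Sequential(nn.Linear(L, D), nn.Tanh(), nn.Dropout(dropout))
        self.attention_b = nn.Sequential(nn.Linear(L, D), nn.Sigmoid(), nn.Dropout(dropout))
        self.attention_c = nn.Linear(D, 1)

    def forward(self, x):              # x: (n_items, L)
        a = self.attention_a(x)        # tanh features in (-1, 1)
        b = self.attention_b(x)        # sigmoid gates in (0, 1)
        return self.attention_c(a * b) # (n_items, 1) unnormalized attention scores

scores = AttnNetGated()(torch.randn(8, 1024))
print(scores.shape)  # torch.Size([8, 1])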