There are two sources that I'm using to try to understand why LSTMs reduce the likelihood of the vanishing gradient problem associated with RNNs.
Both of these sources say that LSTMs are able to reduce the likelihood of the vanishing gradient problem because:
1. The gradient contains the forget gate's vector of activations.
2. The addition of four gradient values helps balance the gradient values.
I understand (1), but I don't understand what (2) means.
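For reference, here is my (possibly mistaken) reading of what the "four gradient values" are, using the notation from the cs231n slides. The cell state is updated as

$$c_t = f_t \odot c_{t-1} + i_t \odot g_t,$$

where the gates $f_t$, $i_t$ and the candidate $g_t$ all depend on $h_{t-1}$, which in turn depends on $c_{t-1}$. Differentiating elementwise with the product rule then gives four additive terms:

$$\frac{\partial c_t}{\partial c_{t-1}} = f_t + \frac{\partial f_t}{\partial c_{t-1}} \odot c_{t-1} + \frac{\partial i_t}{\partial c_{t-1}} \odot g_t + i_t \odot \frac{\partial g_t}{\partial c_{t-1}}.$$

I assume these are the four values being added, but I don't see in what sense summing them "balances" the gradient.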
Sources:

- Slide 119 of http://cs231n.stanford.edu/slides/2020/lecture_10.pdf
- The sentence beginning "Another important property to notice is that the cell state" in https://medium.com/datadriveninvestor/how-do-lstm-networks-solve-the-problem-of-vanishing-gradients-a6784971a577
Any insight would be greatly appreciated!