Questions tagged [bptt]

For questions about the backpropagation through time (BPTT) algorithm, which is often used to find the gradient of the objective function with respect to the parameters of a recurrent neural network (RNN) when training the RNN with gradient descent.

2 questions
3
votes
0 answers

What is the time complexity for training a gated recurrent unit (GRU) neural network using back-propagation through time?

Let us assume we have a GRU network containing $H$ layers to process a training dataset with $K$ tuples, $I$ features, and $H_i$ nodes in each layer. I have a pretty basic idea of how the complexity of algorithms is calculated; however, with the…
0
votes
0 answers

In LSTMs, how does the additive property enable better balancing of gradient values during backpropagation?

There are two sources that I'm using to try to understand why LSTMs reduce the likelihood of the vanishing gradient problem associated with RNNs. Both of these sources mention the reason LSTMs are able to reduce the likelihood of the vanishing…