
I am new to Deep Learning.

Suppose that we have a neural network with one input layer, one output layer, and one hidden layer. Let's refer to the weights from input to hidden as $W$ and the weights from hidden to output as $V$. Suppose that we have initialized $W$ and $V$, run them through the network via a forward pass, and then updated $V$ via backpropagation.

When estimating the ideal weights for $W$, do we keep the weights $V$ constant when updating $W$ via gradient descent given we already calculated $V$, or do we allow $V$ to update along with $W$?

So, in the code, which I am writing from scratch, do we include $V$ in the for loop used for gradient descent on $W$? In other words, do we simply reuse the same $V$ for every iteration of gradient descent?

nbro
If you're interested in the details of backpropagation and a general review of deep learning, you may take a look at this technical report on arXiv: [Deep learning for pedestrians: backpropagation in CNNs](https://arxiv.org/abs/1811.11987). Make sure to download it locally and open it with Acrobat Reader to enjoy the illustrations! – ranlot Apr 04 '19 at 15:00

1 Answer


The answer is implied in the term "backpropagation": all gradients are calculated in the same backward pass. The error from your loss function is propagated backwards from the output through the whole network. This propagation yields a gradient for every weight in the network, and that gradient tells you how to change each weight in order to reduce the loss.

> or do we allow $V$ to update along with $W$?

Yes, $V$ is updated along with $W$ on every iteration. This also saves computation, since many of the intermediate results used to compute the gradient for $V$ can be reused when computing the gradient for $W$.
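As a concrete illustration, here is a minimal from-scratch sketch in NumPy. The network size, sigmoid activation, mean-squared-error loss, and variable names are all just assumptions for the example; the point is that every iteration runs one forward pass, computes the gradients for both $V$ and $W$ from the same propagated error, and updates both weight matrices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))       # 100 examples, 3 input features (made up)
y = rng.normal(size=(100, 1))       # 100 targets (made up)
W = 0.1 * rng.normal(size=(3, 4))   # input -> hidden weights
V = 0.1 * rng.normal(size=(4, 1))   # hidden -> output weights
lr = 0.1

for epoch in range(1000):
    # Forward pass
    h = sigmoid(X @ W)               # hidden activations
    y_hat = h @ V                    # linear output layer

    # Backward pass: both gradients come from the same propagated error
    delta_out = (y_hat - y) / len(X)                  # dLoss/d(output), reused below
    dV = h.T @ delta_out                              # gradient w.r.t. V
    delta_hidden = (delta_out @ V.T) * h * (1 - h)    # error pushed back through V and the sigmoid
    dW = X.T @ delta_hidden                           # gradient w.r.t. W

    # Both weight matrices are updated in the same iteration
    V -= lr * dV
    W -= lr * dW
```

Note that `delta_out`, computed while finding the gradient for $V$, is reused to compute `delta_hidden` and hence the gradient for $W$; that reuse is exactly what backpropagation buys you.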

See http://neuralnetworksanddeeplearning.com for a detailed description of the backpropagation algorithm.

Philip Raeisghasem