What does the notation $\nabla_\theta \mathcal{L}$ mean?

Question

Here's the general algorithm of maximum entropy inverse reinforcement learning.

The algorithm uses gradient descent. What I do not understand is that there appears to be only a single gradient value, $\nabla_\theta \mathcal{L}$, and it is used to update a whole vector of parameters. This does not make sense to me, because it seems to update every element of the vector with the same value $\nabla_\theta \mathcal{L}$. Can you explain the logic behind updating a vector with a single gradient?

score 4 · Answer 1 · edited Apr 01 '20 at 13:02

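This is the standard gradient update, the same one used when training with backpropagation. The gradient term $\nabla_\theta \mathcal{L}$ is not a single number but a vector of partial derivatives, $\nabla_\theta \mathcal{L} = \left(\frac{\partial \mathcal{L}}{\partial \theta_1}, \dots, \frac{\partial \mathcal{L}}{\partial \theta_n}\right)$, where the $i$-th element is the partial derivative of the log-likelihood $\mathcal{L}$ with respect to the $i$-th element of the parameter vector $\theta$. It therefore has the same dimensionality as $\theta$. Each element of the parameter vector is updated with its own partial derivative, and these partial derivatives are generally not equal to one another.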
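To make the element-wise update concrete, here is a minimal sketch in Python with NumPy. The objective `loss` is a toy quadratic chosen purely for illustration (it is not the MaxEnt IRL likelihood), and the names `loss`, `grad_loss`, and `learning_rate` are assumptions of this example rather than anything from the algorithm in the question.

```python
import numpy as np

def loss(theta):
    # Toy objective, assumed for illustration only:
    # L(theta) = theta_0^2 + 3 * theta_1^2 + 0.5 * theta_2^2
    return theta[0] ** 2 + 3.0 * theta[1] ** 2 + 0.5 * theta[2] ** 2

def grad_loss(theta):
    # The gradient is a vector with one partial derivative per parameter.
    return np.array([2.0 * theta[0], 6.0 * theta[1], 1.0 * theta[2]])

theta = np.array([1.0, 1.0, 1.0])
learning_rate = 0.1

g = grad_loss(theta)                   # same shape as theta: [2., 6., 1.]
new_theta = theta - learning_rate * g  # element-wise gradient descent step
```

Even though all three parameters start at the same value, each one moves by a different amount, because each is updated with its own partial derivative.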
