
Say that I have a simple Actor-Critic architecture. (I am not familiar with TensorFlow, but) in PyTorch we need to specify the parameters when defining an optimizer (SGD, Adam, etc.), so we can define two separate optimizers for the Actor and the Critic, and the backward pass will be

actor_optimizer.zero_grad()
actor_loss.backward()
actor_optimizer.step()

critic_optimizer.zero_grad()
critic_loss.backward()
critic_optimizer.step()

or we can use a single optimizer for both the Actor's and the Critic's parameters, so the backward pass looks like

loss = actor_loss + critic_loss
optimizer.zero_grad()
loss.backward()
optimizer.step()
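A minimal runnable sketch of both update orders (the toy linear networks and placeholder losses below are my own assumptions, just to show where zero_grad()/backward()/step() go in each approach):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
actor = nn.Linear(4, 2)   # hypothetical tiny actor head
critic = nn.Linear(4, 1)  # hypothetical tiny critic head
x = torch.randn(8, 4)

# Approach 1: separate optimizers, separate backward passes.
actor_optimizer = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_optimizer = torch.optim.Adam(critic.parameters(), lr=1e-3)
actor_loss = -actor(x).mean()           # placeholder losses
critic_loss = critic(x).pow(2).mean()
actor_optimizer.zero_grad()
actor_loss.backward()
actor_optimizer.step()
critic_optimizer.zero_grad()
critic_loss.backward()
critic_optimizer.step()

# Approach 2: one optimizer over both parameter sets, one backward pass.
optimizer = torch.optim.Adam(
    list(actor.parameters()) + list(critic.parameters()), lr=1e-3
)
loss = -actor(x).mean() + critic(x).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Note that in approach 2 the summed loss is differentiated once, so both parameter sets receive gradients from the single `backward()` call.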

I have 2 questions regarding both approaches:

  1. Are there any considerations (pros and cons) for the single joint optimizer versus the separate-optimizers approach?

  2. If I want to save the best Agent (Actor and Critic) periodically (based on a predefined testing environment), do I always have to update the saved Critic, regardless of the current Agent's performance? Because (correct me if I'm wrong) the Critic's most basic purpose is only to predict the action-value or state-value, so a more-trained Critic should be better.
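The periodic saving described in question 2 might be sketched like this (the names `actor`, `critic`, `best_score`, and the file name are my assumptions, not from the question):

```python
import torch
import torch.nn as nn

actor = nn.Linear(4, 2)   # hypothetical networks, as above
critic = nn.Linear(4, 1)

best_score, score = 1.0, 2.0  # dummy evaluation results for illustration

# Save both networks whenever the agent beats its best evaluation score.
if score > best_score:
    best_score = score
    torch.save(
        {"actor": actor.state_dict(), "critic": critic.state_dict()},
        "best_agent.pt",
    )
```

Whether the Critic in the checkpoint should instead always track the latest training state is exactly what the question asks.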

Sanyou
  • You're asking 2 distinct questions here. I would suggest that you ask the second question in a separate post. – nbro Oct 17 '21 at 02:38

1 Answer


I am also very curious about this. I have been implementing A2C in PyTorch from scratch and have tried both a single optimizer and separate optimizers; the separate case learned much faster. I believe it may have something to do with the loss coefficients: balancing the critic coefficient so that the two losses stayed within a similar range seemed to work.
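The balancing idea can be sketched as weighting the critic term before summing (the 0.5 value and the toy networks below are assumptions on my part, though a value coefficient around 0.5 is common in A2C implementations):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
actor = nn.Linear(4, 2)   # hypothetical networks for illustration
critic = nn.Linear(4, 1)
optimizer = torch.optim.Adam(
    list(actor.parameters()) + list(critic.parameters()), lr=7e-4
)

x = torch.randn(8, 4)
actor_loss = -actor(x).mean()           # placeholder losses
critic_loss = critic(x).pow(2).mean()

value_coef = 0.5  # hypothetical weight that keeps the two terms comparable
loss = actor_loss + value_coef * critic_loss

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

With a single optimizer, this coefficient is the main knob for trading off how strongly the shared update is driven by the critic loss versus the actor loss.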

Elfurd