For questions about variance reduction techniques in the context of artificial intelligence, in particular in sampling or Monte Carlo methods. An example of a variance reduction technique is Flipout, proposed in the paper "Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches" (2018) by Yeming Wen et al.
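As a concrete illustration of the kind of technique this tag covers, here is a minimal sketch of antithetic variates, a classic Monte Carlo variance reduction method (the example and all names are illustrative, not taken from Flipout or from any question below):

```python
import numpy as np

# Plain Monte Carlo vs. antithetic variates for estimating E[f(U)], U ~ Uniform(0, 1).
# Here f = exp, so the true value is e - 1. Pairing each sample u with 1 - u and
# averaging exploits the negative correlation between f(u) and f(1 - u).
rng = np.random.default_rng(0)
f = np.exp

n = 10_000
u = rng.random(n)
plain = f(u)                        # standard estimator terms
anti = 0.5 * (f(u) + f(1.0 - u))    # antithetic-pair averages

print(plain.mean(), plain.var())    # same expectation ...
print(anti.mean(), anti.var())      # ... but noticeably smaller variance
```

Both estimators are unbiased for the same quantity; only the spread of the per-sample terms differs, which is exactly what "variance reduction" refers to.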
Questions tagged [variance-reduction]
4 questions
14 votes, 3 answers
Why does it make sense to normalize rewards per episode in reinforcement learning?
In OpenAI's actor-critic and in OpenAI's REINFORCE, the rewards are being normalized like so
rewards = (rewards - rewards.mean()) / (rewards.std() + eps)
on every episode individually.
This is probably the baseline reduction, but I'm not entirely…
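The normalization in the excerpt can be reproduced in a self-contained sketch (NumPy here rather than PyTorch, and the returns are made-up values; `eps` plays the same role as in the excerpt, guarding against division by zero when all rewards are equal):

```python
import numpy as np

# Hypothetical per-episode returns; in the actual OpenAI examples these are
# discounted returns computed from the episode's reward sequence.
rewards = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
eps = np.finfo(np.float32).eps  # tiny constant to avoid dividing by zero

# Shift to zero mean and rescale to (approximately) unit standard deviation.
normalized = (rewards - rewards.mean()) / (rewards.std() + eps)
```

After this transform the returns used in the loss have zero mean and unit scale within the episode, which is one way of keeping the gradient magnitudes comparable across episodes.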

Gulzar
1 vote, 0 answers
Can PPO be applied when the environment is "input-driven"?
I'm reimplementing an RL paper about learning a job scheduling policy that acts so as to minimize average job completion time. They claim that this is an "input-driven" problem, i.e. much of the variance in rewards is due to the randomness in job…

Archie Gertsman
1 vote, 0 answers
What strategies are there to reduce the variance of the policy gradient estimator of the REINFORCE algorithm?
I know one possibility is to subtract a baseline as a running average of rewards from past mini-batches. Another is to compute the mean and…
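The baseline idea mentioned in the excerpt can be sketched on toy data (all numbers below are synthetic stand-ins for the gradient log-probability terms and returns; only the variance comparison between the plain estimator and the baseline-subtracted one is the point):

```python
import numpy as np

# Toy illustration of baseline subtraction for REINFORCE.
rng = np.random.default_rng(1)

returns = rng.normal(loc=10.0, scale=2.0, size=5000)   # stand-in episode returns G
grad_logp = rng.normal(loc=0.0, scale=1.0, size=5000)  # stand-in grad log pi terms

naive = grad_logp * returns                  # plain REINFORCE estimator terms
baseline = returns.mean()                    # here: batch mean; in practice often
                                             # a running average of past returns
with_baseline = grad_logp * (returns - baseline)

# Both have (approximately) the same mean, so the baseline does not bias the
# gradient estimate, but the per-sample variance drops sharply.
print(naive.var(), with_baseline.var())
```

Because the returns sit far from zero (mean 10 in this toy), multiplying them by the zero-mean gradient terms inflates the variance; subtracting the baseline recenters the returns without changing the estimator's expectation.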

postnubilaphoebus
0 votes, 1 answer
How to reduce variance in F1 scores of GAT across multiple runs while using PU Loss?
I am training a GAT with a custom loss function (PU Loss) on the Cora and Citeseer datasets. My training file looks like
f1_scores = []
N_ITER = 10
seeds = np.random.randint(1000, size=N_ITER)
for i in range(N_ITER):
    seed_value = seeds[i]
    …
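One common source of run-to-run variance in a setup like this is unseeded RNG state. A minimal sketch of seeding every source a typical PyTorch training script touches (the torch-specific calls are commented out so the snippet runs standalone; `set_seed` is our own helper name, not from the question):

```python
import random

import numpy as np

def set_seed(seed_value):
    """Seed the RNG sources a typical PyTorch training script draws from."""
    random.seed(seed_value)
    np.random.seed(seed_value)
    # torch.manual_seed(seed_value)
    # torch.cuda.manual_seed_all(seed_value)
    # torch.backends.cudnn.deterministic = True  # trades speed for determinism
    # torch.backends.cudnn.benchmark = False

# With the same seed, repeated draws are identical:
set_seed(42)
a = np.random.rand(3)
set_seed(42)
b = np.random.rand(3)
```

Seeding alone makes individual runs reproducible; it does not by itself shrink the spread *across* different seeds, which is what averaging over `N_ITER` runs, as in the excerpt, is measuring.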

willtryagain