
I am training a GAT with a custom loss function (PU loss) on the Cora and Citeseer datasets. My training file looks like this:

```python
import random

import numpy as np
import torch

f1_scores = []
N_ITER = 10
seeds = np.random.randint(1000, size=N_ITER)  # draw a seed per iteration

for i in range(N_ITER):
    seed_value = seeds[i]
    np.random.seed(seed_value)
    random.seed(None)
    torch.manual_seed(seed_value)
    model = GAT().to(device)
    # train it
    # find f1 score
    f1_scores.append(f1)

print(np.mean(f1_scores))
```

When I run this file multiple times with

```bash
for i in `seq 1 10`; do python train.py; done
```

I am getting high variance in the resulting values (e.g., 0.43 vs. 0.76). I don't understand why this happens even after taking the mean over iterations.

  1. Is this the right way to average the model's F1 scores?
  2. How can I reduce this variance?

I have followed the steps mentioned here. I must use a neural network. I increased the weight decay (L2) values without any success.
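For reference, a minimal sketch of how I apply the L2 penalty, via the optimizer's `weight_decay` argument; the learning rate and decay values below are illustrative, not my exact settings:

```python
import torch

# L2 regularization in PyTorch is typically passed to the optimizer.
# lr and weight_decay here are placeholders, not my actual settings.
optimizer = torch.optim.Adam(model.parameters(), lr=0.005, weight_decay=5e-4)
```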

  • I don't know about PU learning, but from glancing at the introduction of the paper you provided, it sounds like this loss is designed for binary classification. Cora and Citeseer contain 7 and 6 classes, respectively, and are not binary classification tasks. Did you account for this in your script? – Chillston Jan 22 '23 at 16:31
  • Yes, I chose one class as positive and the rest as negative, as described [here](https://arxiv.org/abs/2103.04683). – willtryagain Jan 22 '23 at 17:13
  • Hmm, okay, then I have no clue, but I also don't know about PU training. Sadly, I currently have no time to read the paper you provided. Maybe you could give a brief overview of the method to make it less effort for the community to offer ideas. – Chillston Jan 25 '23 at 16:47

1 Answer

  1. The seeds themselves were randomly sampled, so each time I ran the script I got a different average F1 score. I am no longer sampling the seeds on each run; I have fixed their values (see the sketch at the end of this answer).

  2. The high variance comes from the model's sensitivity to the training data. In PU learning, we sparsely label (~1%) the positive data for training, and especially on graph datasets the model's performance is sensitive to exactly which nodes have been labelled, as sketched below.
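To make that concrete, here is a rough sketch of the PU labelling step, assuming a PyG-style `data` object with fields `y` and `num_nodes`; `pos_class` and `label_frac` are illustrative, not my exact values:

```python
import torch

pos_class = 3      # hypothetical choice of positive class
label_frac = 0.01  # roughly 1% of the positive nodes get a label

# Indices of all nodes belonging to the positive class
pos_idx = (data.y == pos_class).nonzero(as_tuple=True)[0]
n_labeled = max(1, int(label_frac * pos_idx.numel()))

# Which positives end up labelled depends entirely on the RNG state,
# which is why the choice of seed matters so much here.
perm = torch.randperm(pos_idx.numel())
labeled_pos = pos_idx[perm[:n_labeled]]

train_mask = torch.zeros(data.num_nodes, dtype=torch.bool)
train_mask[labeled_pos] = True  # all other nodes are treated as unlabeled
```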

So the run-to-run variance is at least eliminated for a given set of seeds: I now get identical values each time I run the script.
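For reference, a minimal sketch of the fixed-seed loop (the seed values below are arbitrary but hard-coded; note that Python's `random` is now seeded with the same value, since the original `random.seed(None)` re-seeds from OS entropy and is not reproducible):

```python
import random

import numpy as np
import torch

SEEDS = [0, 42, 123, 7, 99, 256, 512, 1024, 2048, 4096]  # fixed once, reused every run

f1_scores = []
for seed_value in SEEDS:
    np.random.seed(seed_value)
    random.seed(seed_value)  # was random.seed(None), which is non-reproducible
    torch.manual_seed(seed_value)
    model = GAT().to(device)
    # train it
    # find f1 score
    f1_scores.append(f1)

print(np.mean(f1_scores))
```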