
I have trained a classification network with PyTorch Lightning, where my training step looks like this:

def training_step(self, batch, batch_idx):
    x, y = batch
    y_hat = self(x)  # raw logits from the model
    loss = F.cross_entropy(y_hat, y)
    self.log("train_loss", loss, on_epoch=True)
    return loss
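For reference, F here is torch.nn.functional, and F.cross_entropy expects raw logits, applying log-softmax internally. A quick sanity check with made-up tensors (not my real data):

import torch
import torch.nn.functional as F

# cross_entropy(logits, targets) == nll_loss(log_softmax(logits), targets)
logits = torch.randn(4, 10)             # fake batch: 4 samples, 10 classes
targets = torch.randint(0, 10, (4,))    # fake integer class labels

loss_a = F.cross_entropy(logits, targets)
loss_b = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(loss_a, loss_b))   # True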

When I look at the output logits, almost all of them are very large negative numbers, with one that is usually 0. Is this normal, or might something be wrong with my training?

I am just applying nn.LogSoftmax() to the outputs and taking the max to make my predictions, but my network is not doing so well on unseen data, and I want to make sure the problem is just overfitting.
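Roughly, my prediction step looks like the sketch below; the dummy model and input are placeholders, not my actual network:

import torch
import torch.nn as nn

# Placeholder stand-ins for the real trained network and a real batch
model = nn.Linear(32, 10)       # dummy 10-class classifier on 32-dim inputs
x = torch.randn(4, 32)

model.eval()
with torch.no_grad():
    logits = model(x)                         # raw scores, shape (4, 10)
    log_probs = nn.LogSoftmax(dim=1)(logits)  # log-probabilities, always <= 0
    preds = log_probs.argmax(dim=1)           # same as logits.argmax(dim=1), since log-softmax is monotonic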

  • Kind of, yes... It's the log of the softmax. The softmax outputs are always > 0 and < 1, and apart from the top class, the rest will be close to 0. So the log of those will be large negative numbers. – user1953366 Apr 28 '22 at 18:40

1 Answer


Sounds like it worked to me.

nn.LogSoftmax returns the log of the softmax (duh). The outputs from softmax add up to 1, and form a probability distribution.

A log-probability of 0 is the log of 1, meaning that class was predicted with nearly 100% confidence. The other classes, with large negative log-probabilities, have probabilities that are effectively zero.
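You can check this directly: exponentiating the log-softmax output recovers the softmax probabilities. A quick sketch with made-up numbers (not taken from your network):

import torch

# Made-up log-softmax output for one sample: one value near 0, the rest very negative
log_probs = torch.tensor([-21.7, -0.0001, -15.3, -33.9])

probs = log_probs.exp()         # undo the log to get the softmax probabilities back
print(probs)                    # ~[0.0000, 0.9999, 0.0000, 0.0000]
print(probs.sum())              # ~1.0, a valid probability distribution
print(probs.argmax().item())    # 1, the class whose log-probability is ~0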
