I have trained a classification network with PyTorch Lightning, where my training step looks like this:
def training_step(self, batch, batch_idx):
    x, y = batch
    y_hat = self(x)  # raw logits; F.cross_entropy applies log-softmax internally
    loss = F.cross_entropy(y_hat, y)
    self.log("train_loss", loss, on_epoch=True)
    return loss  # Lightning needs the loss returned so it can run the backward pass
When I look at the output logits, almost all of them are very large negative numbers, with one that is usually close to 0. Is this normal, or might something be wrong with my training?
I am just applying nn.LogSoftmax() to the outputs and taking the max to make my predictions, but my network is not doing so well on unseen data, and I want to make sure the problem is just overfitting.
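For reference, a minimal sketch of the prediction step described above, using made-up logit values that match the pattern in the question (large negatives with one entry near 0):

```python
import torch
import torch.nn as nn

# Hypothetical logits for one sample over 4 classes, mimicking the
# described pattern: large negative values with one near-zero entry.
logits = torch.tensor([[-20.3, -15.7, 0.0, -31.2]])

# Log-softmax maps logits to log-probabilities (all values <= 0).
log_probs = nn.LogSoftmax(dim=1)(logits)

# argmax over log-probabilities gives the same class as argmax over
# the raw logits, since log-softmax is monotonic per row.
preds = log_probs.argmax(dim=1)
print(preds)  # tensor([2])

# A log-probability near 0 means a probability near 1, so this logit
# pattern just reflects a very confident softmax output.
print(log_probs.max().item())
```

Note that a log-probability close to 0 corresponds to a probability close to 1, so one near-zero entry among large negatives means the model is highly confident in that class.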