While working through some examples from GitHub I found this network (it's for FashionMNIST, but that doesn't really matter).
Here is the PyTorch forward method; my question is in the upper-case comments, about applying softmax on top of ReLU:
    def forward(self, x):
        # two conv/relu + pool layers
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # prep for linear layer: flatten the inputs into a vector
        x = x.view(x.size(0), -1)
        # DOES IT MAKE SENSE TO APPLY RELU HERE
        x = F.relu(self.fc1(x))
        # AND THEN Softmax ON TOP OF IT?
        x = F.log_softmax(x, dim=1)
        # final output
        return x
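To make my concern concrete: ReLU clamps every negative logit to zero, so the softmax that follows can no longer distinguish between them. A quick NumPy sketch (not from the original network, just an illustration of the effect):

```python
import numpy as np

def log_softmax(x):
    # numerically stable log-softmax along the last axis
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

logits = np.array([2.0, -1.0, -3.0])
relu_logits = np.maximum(logits, 0.0)  # ReLU clamps negatives to 0

# Before ReLU the two negative logits get different probabilities;
# after ReLU both become 0 and get identical probabilities.
probs = np.exp(log_softmax(logits))
probs_relu = np.exp(log_softmax(relu_logits))
print(probs)       # three distinct values
print(probs_relu)  # last two entries are equal
```

So the question is whether collapsing the negative half of the pre-softmax scores like this is ever intentional, or just a mistake in the example.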