For a neural network model that classifies images, is it better to use normalization (dividing by 255.0) or standardization (subtracting the mean and dividing by the standard deviation)?

When I started learning convolutional neural networks, I always used normalization because it's simple and effective. But then I started learning PyTorch, and in one of its tutorials (https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html) the images are preprocessed like this:

import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),                                     # scales pixel values to [0, 1]
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])   # then (x - mean) / std per channel

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)

A transform object is created that includes a Normalize transform, which takes the mean and standard deviation for each channel.
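If I understand it correctly, ToTensor first scales the pixel values to [0, 1], and Normalize then computes (x - mean) / std for each channel, so with mean = std = 0.5 the values end up in [-1, 1]. Here is a small sketch of what I think happens (the pixel values are made up just for illustration):

import torch
from torchvision import transforms

# made-up single-channel "image", already scaled to [0, 1] as ToTensor would do
img = torch.tensor([[[0.0, 0.5, 1.0]]])        # shape (C=1, H=1, W=3)

manual = (img - 0.5) / 0.5                     # (x - mean) / std by hand
via_transform = transforms.Normalize((0.5,), (0.5,))(img)

print(manual)         # tensor([[[-1.,  0.,  1.]]])
print(via_transform)  # same values, so the range becomes [-1, 1]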

At first, I didn't understand how this works, but then I learned how standardization works from one of Andrew Ng's videos. However, I still couldn't find an answer to why standardization is better than normalization, or vice versa. I understand that normalization scales inputs to [0, 1], while standardization first subtracts the mean, so that the dataset is centered around 0, and then divides by the standard deviation, so that the variance is normalized.
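To make sure I'm comparing the right things, this is how I picture the two options applied to raw 8-bit pixel values (the numbers are just placeholders):

import torch

# made-up raw 8-bit pixel values
img = torch.tensor([0., 64., 128., 255.])

normalized = img / 255.0                        # normalization: scale to [0, 1]
standardized = (img - img.mean()) / img.std()   # standardization: zero mean, unit variance

print(normalized)     # values in [0, 1]
print(standardized)   # values centered around 0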

Though I know (or think I know) how each of these techniques works, I still don't understand why anyone would use one over the other to preprocess images.

Could anybody explain where and why you would use normalization or standardization (if possible, with an example)? And as a side question: is it better to use the combined version, where you first normalize the image and then standardize it?
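To be concrete, by the combined version I mean something like the sketch below: first scale the images to [0, 1] with ToTensor, then standardize them with the per-channel mean and standard deviation measured on the training set (the way I compute the statistics here is just my guess at how it should be done):

import torch
import torchvision
import torchvision.transforms as transforms

# first pass: load the training images scaled to [0, 1] to estimate per-channel statistics
raw = torchvision.datasets.CIFAR10(root='./data', train=True,
                                   download=True, transform=transforms.ToTensor())
data = torch.stack([img for img, _ in raw])     # shape (N, 3, 32, 32), fits in memory for CIFAR-10
mean = data.mean(dim=(0, 2, 3))
std = data.std(dim=(0, 2, 3))

# second pass: normalize to [0, 1], then standardize with the measured statistics
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize(mean.tolist(), std.tolist())])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)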
