4

There is this video on pythonprogramming.net that trains a network on the MNIST handwriting dataset. At ~9:15, the author explains that the data should be normalized.

The normalization is done with

x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)

The explanation is that values in the range 0 ... 1 make it easier for a network to learn. That makes sense if we consider sigmoid activations, which would otherwise map almost all raw pixel values to 1.
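To illustrate the saturation argument with a minimal numpy sketch (my own example, not from the video): raw pixel intensities push a sigmoid deep into its flat region, while values scaled to 0 ... 1 stay in the responsive range.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Raw pixel intensities saturate the sigmoid: anything above ~6 is already ~1.0
print(sigmoid(np.array([0.0, 128.0, 255.0])))
# Scaled to 0 ... 1, the outputs remain distinguishable
print(sigmoid(np.array([0.0, 0.5, 1.0])))
```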

I could also understand that we want black to be pure black, so we want to adjust any offset in black values. Also, we want white to be pure white and potentially stretch the data to reach the upper limit.

However, I think the kind of normalization applied in this case is incorrect. The image before was:

Original image

After the normalization it is

Normalized image

As we can see, some pixels that were black before have become grey. Columns that previously contained few black pixels still end up black, while columns with many black pixels end up a lighter grey.

This can be confirmed by applying the normalization on a different axis:

Normalization applied to a different axis

Now, rows that previously contained few black pixels end up black, while rows with many black pixels end up a lighter grey.
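As far as I can tell, tf.keras.utils.normalize divides by the L2 norm along the given axis by default, so with axis=1 on a batch of shape (N, rows, cols), each column of each image is scaled by that column's own norm. A numpy reconstruction of this behaviour on a toy single image (my sketch, not the tutorial's code):

```python
import numpy as np

# Toy 3x3 "image": one solid bright column and one column with a single mark
image = np.array([[0., 10., 0.],
                  [0., 10., 5.],
                  [0., 10., 0.]])

# For a batch, normalize(batch, axis=1) L2-normalizes along the row axis,
# i.e. each COLUMN of each image is divided by its own L2 norm
norms = np.linalg.norm(image, axis=0, keepdims=True)
normalized = image / np.where(norms == 0, 1, norms)

# The lone 5 becomes 1.0, while every 10 shrinks to ~0.577:
# brightness ranking across columns is destroyed
print(normalized)
```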

Is normalization used the right way in this tutorial? If so, why? If not, would my normalization be correct?

What I expected was a per-pixel mapping from e.g. [3 ... 253] (greyscale values) to [0.0 ... 1.0]. In Python code, I think this should do:

import numpy as np
import imageio

image = imageio.imread("sample.png")
# Min-max scaling: map the darkest pixel to 0.0 and the brightest to 1.0
image = (image - np.min(image)) / np.ptp(image)
Thomas Weller
  • This is one of my pet peeves... Either using an inappropriate standardization/scaling/normalization or using the wrong word to describe what's been done. :). Good catch @thomas! – David Hoelzer Oct 24 '22 at 06:47
  • Yep, definitely a good catch. Probably worth posting a polite correction in the comments section of the video. – Snehal Patel Oct 24 '22 at 13:19

2 Answers

1

The normalize() function used in the tutorial is the wrong function for this context: tf.keras.utils.normalize divides by the L1 or L2 norm along an axis; it does not scale values into a fixed range. The correct layer/function for scaling between 0 and 1 would be tf.keras.layers.Rescaling():

x_train_norm = tf.keras.layers.Rescaling(scale=1./255, offset=0.0)(x_train)
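For uint8 MNIST pixels this layer reduces to an element-wise division by 255, which can be sketched without TensorFlow (a numpy stand-in for the Keras layer, not the layer itself):

```python
import numpy as np

x_train = np.array([[0, 128, 255]], dtype=np.uint8)  # stand-in pixel row
# Same mapping as Rescaling(scale=1./255, offset=0.0)
x_train_norm = x_train.astype(np.float32) / 255.0
print(x_train_norm)  # black -> 0.0, white -> 1.0
```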
Snehal Patel
0

You could rank every pixel in terms of brightness before normalization and after to verify that the ranks are preserved. Correct normalization would preserve the ranks. From the pictures 1 and 2, it seems that the ranks were not preserved (e.g. the top right pixels went from grey to black).

The tutorial's normalization is done incorrectly: x_train is an array of 2D images, but the normalization was applied along axis 1, so each column of every image was normalized relative to itself (compare images 1 and 2 column-wise – within each column, the ranks were preserved). To normalize every pixel relative to the image it's in, the norm would have to be taken over both image axes at once, not a single axis. It seems the incorrect normalization wasn't enough to sabotage the learning, though! Your proposed min-max normalization should work fine.
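The rank check described above can be sketched in numpy (ranks() is a hypothetical helper of mine, not from either post):

```python
import numpy as np

def ranks(image):
    # Rank of every pixel by brightness (0 = darkest)
    flat = image.ravel()
    return np.argsort(np.argsort(flat))

rng = np.random.default_rng(0)
image = rng.random((4, 4))

# Min-max scaling is a monotone map, so pixel ranks are preserved
minmax = (image - image.min()) / np.ptp(image)
print(np.array_equal(ranks(image), ranks(minmax)))  # True

# Per-column L2 normalization rescales each column by a different factor,
# so ranks across columns are generally not preserved
per_col = image / np.linalg.norm(image, axis=0, keepdims=True)
print(np.array_equal(ranks(image), ranks(per_col)))
```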

mark