How do I normalise/un-normalise data when loading a model?

Question

I am following this TensorFlow JS tutorial where you load car data. The data looks like this:

[{x:100, y:20}, {x:80, y:33}]

X is the horsepower of a car, Y is the expected miles per gallon usage. After creating the model I save it locally using:

async function saveModel(){
    await model.save('downloads://cars-model');
}

Next, I load the model in a separate project, to make predictions without needing the original data.

NEW PROJECT

async function app(){
    let model = await tf.loadLayersModel('./cars-model.json');
    console.log("car model is loaded!");
}

I expect to be able to run predict here on a single number (say, 120):

model.predict(tf.tensor2d([120], [1, 1]))

QUESTION

I think the number 120 needs to be normalised to a number between 0-1, just like the training data was. But how do I know the inputMin, inputMax, labelMin, labelMax values from the loaded model?

To un-normalise the prediction (in this case 0.6) I also need those original values.

How do I normalise/un-normalise data when loading a model?

The original prediction code uses the label and input values from the original data:

function testModel(model, inputData, normalizationData) {
    const { inputMax, inputMin, labelMin, labelMax } = normalizationData;

    // Generate predictions for a uniform range of numbers between 0 and 1.
    // We un-normalize the data by doing the inverse of the min-max scaling
    // that we did earlier.
    const [xs, preds] = tf.tidy(() => {

        const xs = tf.linspace(0, 1, 100);
        const preds = model.predict(xs.reshape([100, 1]));

        const unNormXs = xs
            .mul(inputMax.sub(inputMin))
            .add(inputMin);

        const unNormPreds = preds
            .mul(labelMax.sub(labelMin))
            .add(labelMin);

        // Return plain typed arrays; tf.tidy disposes the intermediate tensors.
        return [unNormXs.dataSync(), unNormPreds.dataSync()];
    });


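    const predictedPoints = Array.from(xs).map((val, i) => {
        return { x: val, y: preds[i] };
    });

    // Return the un-normalized points so the caller can plot or inspect them.
    return predictedPoints;
}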

Hi @Kokodoko, did you work out a solution? I've hit the same roadblock: I'm pulling in the original min/max data as JSON, but it's giving me trouble. I have been googling for days but can't find any tutorials that load the model separately, and this is the only reference to it I can see. Weird, eh? How is everyone else doing it? :) – laurencefay, Sep 16 '20 at 13:48
@laurencefay According to the answer below, you just have to save the original min/max values in a separate JSON file. Then you can load that JSON file later to retrieve those values and un-normalize the results. – Kokodoko, Sep 22 '20 at 07:56

score 1 · Accepted Answer · answered Apr 01 '19 at 13:02

How do I convert this to a number between 0-1 without having access to the original car data?

You save the normalisation parameters (typically an offset and a multiplier for each column) and consider them part of the model. The usual time to do this is when you first scale the training data.
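As a rough sketch of that saving step, the min-max parameters could be computed once from the training data and serialised as JSON. The sample points, the helper name `computeNormalizationData`, and the idea of keeping the resulting string in a file next to `cars-model.json` are assumptions for illustration, not something the tutorial prescribes.

```javascript
// Hypothetical training data in the same {x, y} shape as in the question.
const data = [{ x: 100, y: 20 }, { x: 80, y: 33 }, { x: 150, y: 15 }];

// Compute the min-max parameters once, while the training data is available.
// (Hypothetical helper; the names mirror those used in the question's code.)
function computeNormalizationData(points) {
    const xs = points.map(p => p.x);
    const ys = points.map(p => p.y);
    return {
        inputMin: Math.min(...xs),
        inputMax: Math.max(...xs),
        labelMin: Math.min(...ys),
        labelMax: Math.max(...ys),
    };
}

const normalizationData = computeNormalizationData(data);

// Plain numbers serialise cleanly. This string is what would be written
// to a file stored alongside the saved model.
const normalizationJson = JSON.stringify(normalizationData);
console.log(normalizationJson);
```

In the tutorial code the four values are tensors rather than plain numbers, so they would first need converting, for example with `inputMin.dataSync()[0]`, before they can be serialised this way.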

When you want to re-use the model, loading the neural network architecture and weights is not enough: you also need to load the normalisation parameters, so that you can scale new inputs (and un-scale the outputs) in exactly the same way as during training.
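A minimal sketch of the re-use side, assuming the parameters were stored as a JSON string with the four keys from the question. The numbers in `savedJson` are made up, the helpers `normalise`, `unNormalise` and `predictMpg` are hypothetical names, and `fakeModel` is only a stand-in for the network so that the sketch runs without TensorFlow.js.

```javascript
// Contents of the hypothetical side file; these numbers are invented.
const savedJson = '{"inputMin":50,"inputMax":250,"labelMin":10,"labelMax":50}';
const { inputMin, inputMax, labelMin, labelMax } = JSON.parse(savedJson);

// Min-max scaling and its inverse, matching the formula in the question's code.
function normalise(value, min, max) {
    return (value - min) / (max - min);
}

function unNormalise(value, min, max) {
    return value * (max - min) + min;
}

// predictNormalised is any function that maps a normalised input (0..1)
// to a normalised output (0..1), such as a wrapper around model.predict.
function predictMpg(predictNormalised, horsepower) {
    const scaledInput = normalise(horsepower, inputMin, inputMax);
    const scaledOutput = predictNormalised(scaledInput);
    return unNormalise(scaledOutput, labelMin, labelMax);
}

// Stand-in for the loaded model: always answers 0.6, as in the question.
const fakeModel = scaledInput => 0.6;

console.log(predictMpg(fakeModel, 120));
```

With the loaded model from the question, the stand-in could be replaced by something like `x => model.predict(tf.tensor2d([x], [1, 1])).dataSync()[0]`.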

Tutorials often present a self-contained neural network that loads training data, builds a model and then tests that model, all in the same process, so this step is frequently not shown. However, saving the normalisation data is important, and it should be considered part of the model even though it is not directly part of the neural network parameters or hyper-parameters.

Neil Slater

OK, thanks for clearing that up. So do I save that data manually, in a separate file, or should it be part of the saved model? – Kokodoko Apr 01 '19 at 13:08
@Kokodoko: It's up to you as the developer which you prefer, and may depend on your use case. For instance if you always use the same training data, scaling it the same way and want to move different NNs around for testing, then separate files will be slightly easier and quicker to develop. But a self-contained file with supporting data for the model might make more sense in other situations – Neil Slater Apr 01 '19 at 13:18
But if I use `model.save`, is there an option to include that data? Can I add it manually before saving? For example, `model.min = 100; model.max = 200`. – Kokodoko Apr 01 '19 at 14:55
@Kokodoko: I don't know, that depends on how function `save` is written – Neil Slater Apr 01 '19 at 14:56
I'm using the Tensorflow JS API: https://js.tensorflow.org/api/latest/#tf.LayersModel.save It seems to have a `config` option but this is not further explained... :( – Kokodoko Apr 01 '19 at 16:56
From the description it does not look like it will accept added data, so you will need to arrange to store the normalisation parameters separately – Neil Slater Apr 01 '19 at 17:14