
The ready-to-use DNNClassifier in tf.estimator seems unable to fit this data:

X = [[1,2], [1,12], [1,17], [9,33], [48,49], [48,50]]
Y = [ 1,     1,      1,      1,      2,       3     ]

I've tried with 4 hidden layers, but it only fits to 83% accuracy (5/6 samples):

hidden_units = [2000,1000,500,100]
n_classes    = 4   
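
Roughly, the training setup looks like this (a minimal sketch assuming the TF 1.x estimator API; the input pipeline shown is illustrative, not my exact code):

import numpy as np
import tensorflow as tf

X = np.array([[1, 2], [1, 12], [1, 17], [9, 33], [48, 49], [48, 50]], dtype=np.float32)
Y = np.array([1, 1, 1, 1, 2, 3], dtype=np.int32)

# A single 2-dimensional numeric feature
feature_columns = [tf.feature_column.numeric_column('x', shape=[2])]

classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[2000, 1000, 500, 100],
    n_classes=4)

# Feed the whole (tiny) dataset every step
train_input_fn = tf.compat.v1.estimator.inputs.numpy_input_fn(
    x={'x': X}, y=Y, batch_size=6, num_epochs=None, shuffle=True)

classifier.train(input_fn=train_input_fn, steps=1000)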

The sample data above are supposed to be separated by 2 lines:

[Image: scatter plot of the six sample points with the two separating lines]

It seems stuck because Y=2 and Y=3 are too close together. How can I change the DNNClassifier to fit to 100%?

Dee

1 Answer


Normalise your inputs.

Neural networks work poorly outside of relatively small numerical ranges on input. Ideally, each feature is drawn from $\mathcal{N}(0,1)$, i.e. a normal distribution with mean $0$ and standard deviation $1$. In your case, dividing both components of $\mathbf{x}$ by $25$ and subtracting $1$ would probably suffice.
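
For example, with the data above the rescaling is a one-liner:

import numpy as np

X = np.array([[1, 2], [1, 12], [1, 17], [9, 33], [48, 49], [48, 50]], dtype=np.float32)
X_norm = X / 25.0 - 1.0   # each feature now lies roughly in [-1, 1]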

Your neural network architecture is completely overblown for the problem at hand. That may be because you were trying to force it to fit this data (and failing because of the lack of normalisation). Try something much smaller, like `hidden_units = [20,10]`.
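
For illustration, here is a minimal Keras sketch along those lines (not the exact code from the gist linked in the comments below; the optimiser, learning rate and epoch count are illustrative choices, using the `tanh` hidden activations discussed in the comments):

import numpy as np
from tensorflow import keras

X = np.array([[1, 2], [1, 12], [1, 17], [9, 33], [48, 49], [48, 50]], dtype=np.float32)
Y = np.array([1, 1, 1, 1, 2, 3])           # labels 1..3; class 0 never occurs

X_norm = X / 25.0 - 1.0                    # normalise as described above

model = keras.Sequential([
    keras.layers.Dense(20, activation='tanh', input_shape=(2,)),
    keras.layers.Dense(10, activation='tanh'),
    keras.layers.Dense(4, activation='softmax'),   # 4 outputs to match n_classes=4
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_norm, Y, epochs=500, verbose=0)
loss, acc = model.evaluate(X_norm, Y, verbose=0)
print(f'accuracy: {acc:.2f}')

On this toy dataset such a network can reach 100% training accuracy within a few hundred epochs, though with only 6 rows the result can vary between runs.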

Neil Slater
  • My inputs are now divided by 50 to be in the range [0,1], but accuracy is still 83% (5/6). – Dee Aug 23 '19 at 08:27
  • Any other changes that I should apply? – Dee Aug 23 '19 at 08:29
  • @datdinhquoc: I'll take a deeper look. You probably need to play with other hyperparameters, such as the learning rate, in order to get the classifier to split classes 2 and 3 given just one example of each. It may be an interesting toy example because of that, although typically you would not use a NN on real data with only 6 rows and 4 classes (one of which does not even appear in the data set). – Neil Slater Aug 23 '19 at 08:30
  • @datdinhquoc: I can get 100% accuracy (in Keras) reliably using a simple network as I suggested, and around 500 epochs. It did take a few attempts with different hyperparameters. One thing worth knowing is that `tanh` activations on hidden layers work better than `relu` for smaller networks. However, `relu` will work; you just need to make the NN slightly larger. – Neil Slater Aug 23 '19 at 10:17
  • Thanks, I'll try switching the code to Keras. May I have your 100%-accuracy code? Maybe in Google Colab? – Dee Aug 23 '19 at 10:24
  • @datdinhquoc OK, here is a gist: https://gist.github.com/neilslater/6053e3729045e575f40f4d884f4d65dc - it is just hacky code. I replicated your data as the train, CV and test sets - it is important not to do that for real data if you want to test whether your NN has generalised correctly; instead you should split your dataset properly into separate datasets. If you want to know more about that, please research it or ask a new question on the site. – Neil Slater Aug 23 '19 at 17:14
  • Thanks for the gist. – Dee Aug 25 '19 at 08:07