
The following X-shaped alternating pattern can be separated quite well, and very fast, by the k-nearest-neighbour algorithm (go to https://ml-playground.com to test it):

[image of X-shaped alternating data]

However, a DNN seems to struggle greatly to separate this X-shaped alternating data. Is it possible to run k-nearest neighbours before the DNN, i.e. somehow set the DNN's weights so that they reproduce the k-NN result before DNN training begins?

Another place to test the X-shaped alternating data: https://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html

Dee
  • If the data are already labeled, clustering wouldn't add anything new. The only known way to set the weights of a DNN is training from random initialization. The convergence of training relies on the randomness of the weights (it's used in all existing proofs). It could be that in the future there will be non-random ways to initialize weights, but such methods would necessarily be very complex, and putting additional constraints on them would probably be extremely hard. In any case, no such non-random methods exist for now. – mirror2image Sep 26 '19 at 05:56
  • Yeah, but what if we could just initialise the weights in some greedy manner? – Dee Sep 26 '19 at 06:04

1 Answer


There are two factors that determine how well a deep neural network can fit a given dataset: the amount of data, and the depth and width of the network. Since the pattern is only 2-D, it can likely be approximated by some fairly simple periodic function. A DNN can approximate periodic functions pretty well, so the issue is probably that you don't have enough data.
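As a quick illustration of that claim (not the OP's data; a synthetic 1-D example, assuming scikit-learn is available), a small tanh MLP has no trouble fitting a smooth periodic target given enough samples:

```python
# Sketch: a small MLP fitting a 1-D periodic function.
# The dataset here is synthetic, not the playground data.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 4 * np.pi, size=(400, 1))  # two full periods
y = np.sin(X).ravel()

# A modest two-hidden-layer tanh network suffices for a smooth periodic target.
net = MLPRegressor(hidden_layer_sizes=(64, 64), activation="tanh",
                   max_iter=4000, random_state=1)
net.fit(X, y)
print(net.score(X, y))  # R^2 on the training data
```

With fewer samples per period, the fit degrades noticeably, which is the "not enough data" failure mode described above.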

If you have an a priori belief that the pattern is well approximated by k-nearest neighbours, then you could do the following:

  1. Fit a K-NN model to the data.
  2. Generate $N$ new points uniformly at random from the input space.
  3. Label the $N$ new points using the K-NN model.
  4. Fit your DNN to the original dataset, plus the $N$ new points.
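The four steps above can be sketched as follows, assuming scikit-learn; the dataset and its labeling rule here are hypothetical stand-ins for the X-shaped playground data:

```python
# Sketch of the K-NN-then-DNN augmentation recipe above.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Original labeled data: a hypothetical X-shaped pattern, where the label
# says which diagonal (y = x or y = -x) a point lies closer to.
X = rng.uniform(-1, 1, size=(200, 2))
y = (np.abs(X[:, 0] - X[:, 1]) < np.abs(X[:, 0] + X[:, 1])).astype(int)

# 1. Fit a K-NN model to the data.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# 2.-3. Generate N new points uniformly at random from the input space
#        and label them using the K-NN model.
N = 2000
X_new = rng.uniform(-1, 1, size=(N, 2))
y_new = knn.predict(X_new)

# 4. Fit the DNN to the original dataset plus the N pseudo-labeled points.
X_aug = np.vstack([X, X_new])
y_aug = np.concatenate([y, y_new])
dnn = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000,
                    random_state=0).fit(X_aug, y_aug)
```

In effect, the K-NN model acts as a cheap teacher that densifies the training set, so the DNN no longer has to interpolate the alternating pattern from sparse examples.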
John Doucette