2

Feature scaling, in general, is an important stage in the data preprocessing pipeline.

Decision Tree and Random Forest algorithms, though, are scale-invariant - i.e. they work fine without feature scaling. Why is that?

stoic-santiago
  • 1,121
  • 5
  • 18
  • What problem are you trying to solve? This is a very general question. Are you trying to understand the limitations of the Random Forest classifier and regressor? – Golden Lion Feb 15 '21 at 23:08

2 Answers

3

Scaling only matters when there is something in the model that reacts to that scale. Decision trees, though, just make a cut at a certain threshold.

Imagine a feature that ranges from 0 to 100, where a cut at 50 improves performance. Scaling it down to the range 0 to 1 simply turns that cut into 0.5; the partition of the data doesn't change at all.
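Here is a minimal sketch of that point, assuming scikit-learn is available: a tree fit on raw features and a tree fit on min-max-scaled features end up with rescaled thresholds but the same partitions, so their predictions typically coincide.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit one tree on the raw features.
raw_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Fit another tree on min-max-scaled features (each feature mapped to [0, 1]).
scaler = MinMaxScaler().fit(X_train)
scaled_tree = DecisionTreeClassifier(random_state=0).fit(
    scaler.transform(X_train), y_train)

# The split thresholds differ by the scaling factor, but the induced
# partitions are the same, so the predictions typically match exactly.
same = np.array_equal(raw_tree.predict(X_test),
                      scaled_tree.predict(scaler.transform(X_test)))
print("Identical predictions:", same)  # expected: True
```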

Neural networks, on the other hand, have activation functions (leaving ReLU aside) that react differently to inputs above 1. Here normalization, i.e. putting every feature between 0 and 1, makes sense.
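A common way to handle this, sketched below with scikit-learn (the specific estimator choice is just an illustration), is to put the scaler and the network into one pipeline so every feature is normalized before it reaches the activations:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# Normalize to [0, 1] inside the pipeline, then fit the network.
nn = make_pipeline(MinMaxScaler(),
                   MLPClassifier(max_iter=1000, random_state=0))
print(cross_val_score(nn, X, y, cv=5).mean())
```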

N. Kiefer
  • 311
  • 3
  • 9
2

Feature scaling matters when a model relies on a distance metric (or some other numerical evaluation of the raw feature values). Therefore models such as support vector machines, neural networks, distance-based clustering methods (e.g. k-means), and linear/logistic regression are affected by feature scaling.

Models that are based on probabilities or split rules rather than distances are scale-invariant. These include Naive Bayes classifiers and decision trees.
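To make the distance-based case concrete, here is a small sketch (scikit-learn assumed, with synthetic data made up for illustration): the true grouping lives in a small-range feature, while a noisy feature with a much larger range dominates the Euclidean distance, so k-means only recovers the grouping after standardization.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# True groups differ only in feature 0 (small range);
# feature 1 is pure noise with a much larger range.
true = np.repeat([0, 1], 150)
X = np.column_stack([true + 0.1 * rng.standard_normal(300),
                     1000 * rng.random(300)])

km = KMeans(n_clusters=2, n_init=10, random_state=0)
raw_labels = km.fit_predict(X)
scaled_labels = km.fit_predict(StandardScaler().fit_transform(X))

# Raw distances are dominated by the noisy large-range feature,
# so only the scaled run should recover the true grouping
# (expected: ARI near 0 for raw, near 1 for scaled).
print("ARI raw    vs truth:", adjusted_rand_score(true, raw_labels))
print("ARI scaled vs truth:", adjusted_rand_score(true, scaled_labels))
```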

Ghostpunk
  • 21
  • 6