I have a Remaining Useful Life (RUL) prediction problem that I want to solve. When I added two or more features as inputs to my ANN, its accuracy decreased. More precisely, I added features like RMS or kurtosis (or both), expecting the system to improve, but it got worse.

Why might this be happening? What are the potential reasons for this degradation in performance?

I know that adding more nodes to layers (e.g., the hidden layers) can lead to overfitting. Could that be related to my problem of using more than two features?

1 Answer

Additional features, like additional nodes, can also cause overfitting if they carry little or misleading information.

Consider the following problem:

$X = [1, 3, 3, 4, 5]$, $Y = [1, 3, 4, 4, 5]$.

Suppose that the real dataset was generated from the relationship:

$Y = X$, with probability $0.2$ of having $1$ added or subtracted.

A reasonable model is $Y = X$. Note that no model using $X$ alone can fit this data perfectly, because the two samples with input $3$ map to different outputs.
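
As a quick sanity check, here is that arithmetic in plain Python (illustrative only): the identity model misses exactly one of the five training samples.

```python
# Toy training set from the example above.
X = [1, 3, 3, 4, 5]
Y = [1, 3, 4, 4, 5]

# The identity model predicts y_hat = x for every sample.
wrong = sum(1 for x, y in zip(X, Y) if x != y)
print(f"Y = X is wrong on {wrong}/{len(X)} training samples "
      f"({100 * wrong / len(X):.0f}%)")  # -> 1/5 (20%)
```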

Now suppose we add a new feature: a random integer between $0$ and $10$, say $W = [1, 5, 2, 6, 3]$.

It may not be obvious, but a sufficiently deep and wide neural network can learn a function like:

$g(W) = 1$ if $W \in \{0, 2, 4, 7, 8, 9\}$,

$g(W) = 0$ otherwise,

and define a new prediction: $Y = X + g(W)$.

This happens to produce a perfect fit on the training data. However, it will perform extremely poorly on new data (such as a test set), because it has learned a meaningless pattern out of random noise. It will be wrong on well over half of new samples, while our first model is wrong on only 20% of them.
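
To make this concrete, here is a small simulation (plain Python; it assumes the $0.2$ noise probability is split evenly between $+1$ and $-1$, and that $W$ is a uniform integer in $[0, 10]$, neither of which is pinned down above). It hand-codes the memorizing model, confirms the perfect training fit, and then estimates both models' error rates on fresh data:

```python
import random

random.seed(0)

# g(W) memorizes which W values happened to coincide with +1 noise in training.
NOISY_W = {0, 2, 4, 7, 8, 9}

def g(w):
    return 1 if w in NOISY_W else 0

# Training data from the example.
X_train = [1, 3, 3, 4, 5]
Y_train = [1, 3, 4, 4, 5]
W_train = [1, 5, 2, 6, 3]

# The memorizing model Y = X + g(W) fits the training set exactly.
assert all(x + g(w) == y for x, y, w in zip(X_train, Y_train, W_train))

# Estimate error rates on fresh data from the assumed generating process.
n = 100_000
wrong_id = wrong_mem = 0
for _ in range(n):
    x = random.randint(1, 5)
    w = random.randint(0, 10)  # new, independent random feature
    noise = random.choices([0, 1, -1], weights=[0.8, 0.1, 0.1])[0]
    y = x + noise
    wrong_id += (x != y)            # identity model Y = X
    wrong_mem += (x + g(w) != y)    # memorizing model Y = X + g(W)

print(f"Y = X        wrong on {wrong_id / n:.0%} of fresh samples")   # ~20%
print(f"Y = X + g(W) wrong on {wrong_mem / n:.0%} of fresh samples")  # ~58%
```

The noise feature buys a perfect training fit at the cost of roughly tripling the error on unseen data, which is exactly the failure mode described above.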
