I have a Remaining Useful Life (RUL) prediction problem that I want to solve. When I added two or more features as inputs to my ANN, its accuracy decreased. More precisely, I added features like RMS or kurtosis (or both), expecting the system to improve, but it got worse.

Why might this be happening? What are the potential reasons for this degradation in performance?

I know that adding more nodes to layers (e.g., the hidden layers) can lead to overfitting. Could that be related to my problem of using more than two features?

1 Answer

Additional features, like additional nodes, can also cause overfitting if they carry little or misleading information.

Consider the following problem:

$X = [1, 3, 3, 4, 5]$, $Y = [1, 3, 4, 4, 5]$.

Suppose that the real dataset was generated from the relationship:

$Y = X$, with probability $0.2$ of having $1$ added or subtracted.

A reasonable model is $Y = X$. Note that no model using $X$ alone can fit this data perfectly, because the two samples with input $3$ map to different outputs.
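
As a quick sanity check, here is that arithmetic in plain Python (illustrative only): the identity model misses exactly one of the five training samples.

```python
# Toy training set from the example above.
X = [1, 3, 3, 4, 5]
Y = [1, 3, 4, 4, 5]

# The identity model predicts y_hat = x for every sample.
wrong = sum(1 for x, y in zip(X, Y) if x != y)
print(f"Y = X is wrong on {wrong}/{len(X)} training samples "
      f"({100 * wrong / len(X):.0f}%)")  # -> 1/5 (20%)
```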

Now suppose we add a new feature: a random integer between $0$ and $10$, say $W = [1, 5, 2, 6, 3]$.

It may not be obvious, but a sufficiently deep and wide neural network can learn a function like:

$g(W) = 1$ if $W \in \{0, 2, 4, 7, 8, 9\}$,

$g(W) = 0$ otherwise,

and define a new prediction: $Y = X + g(W)$.

This happens to produce a perfect fit on the training data. However, it will perform extremely poorly on new data (such as a test set), because it has learned a meaningless pattern out of random noise. It will be wrong on well over half of new samples, while our first model is wrong on only 20% of them.
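
To make this concrete, here is a small simulation (plain Python; it assumes the $0.2$ noise probability is split evenly between $+1$ and $-1$, and that $W$ is a uniform integer in $[0, 10]$, neither of which is pinned down above). It hand-codes the memorizing model, confirms the perfect training fit, and then estimates both models' error rates on fresh data:

```python
import random

random.seed(0)

# g(W) memorizes which W values happened to coincide with +1 noise in training.
NOISY_W = {0, 2, 4, 7, 8, 9}

def g(w):
    return 1 if w in NOISY_W else 0

# Training data from the example.
X_train = [1, 3, 3, 4, 5]
Y_train = [1, 3, 4, 4, 5]
W_train = [1, 5, 2, 6, 3]

# The memorizing model Y = X + g(W) fits the training set exactly.
assert all(x + g(w) == y for x, y, w in zip(X_train, Y_train, W_train))

# Estimate error rates on fresh data from the assumed generating process.
n = 100_000
wrong_id = wrong_mem = 0
for _ in range(n):
    x = random.randint(1, 5)
    w = random.randint(0, 10)  # new, independent random feature
    noise = random.choices([0, 1, -1], weights=[0.8, 0.1, 0.1])[0]
    y = x + noise
    wrong_id += (x != y)            # identity model Y = X
    wrong_mem += (x + g(w) != y)    # memorizing model Y = X + g(W)

print(f"Y = X        wrong on {wrong_id / n:.0%} of fresh samples")   # ~20%
print(f"Y = X + g(W) wrong on {wrong_mem / n:.0%} of fresh samples")  # ~58%
```

The noise feature buys a perfect training fit at the cost of roughly tripling the error on unseen data, which is exactly the failure mode described above.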
