
I have made a neural network from scratch (in Java), which refuses to switch out of linear regression. I have increased the layer sizes (it now has 2 hidden layers, both with 5 neurons), and yet when given steeply sloping polynomials to train on, it still predicts values that fall on a straight line, even though this returns a high cost.

The network is working in the sense that the predictions follow the polynomial about as well as a line could, but why won't it actually give me predictions that curve like the polynomial it trains on?

I have checked all aspects of training: SGD is working as it should, as is the cost function (MSE), and yet the network just isn't able to find a way to minimise the cost; it can't seem to break free of linear regression.

Gamaray
  • What kind of activation functions do you use? – Veysel Ersoy Sep 03 '22 at 13:24
  • Ah, currently I don't use any. And now I have just found a resource going through why I need to: https://stackoverflow.com/questions/51442459/why-is-relu-used-in-regression-with-neural-networks. Thank you. – Gamaray Sep 04 '22 at 07:29

1 Answer


A neural network is basically a composition of matrix multiplications (linear combinations) and non-linear activations. When no activation function is specified (as in classical libraries that default to the identity activation y = x), the whole network stays linear: a composition of affine maps is itself an affine map, so no amount of extra layers or neurons can produce anything other than a line.
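To see the collapse concretely, here is a minimal scalar sketch in Java (the weights and names are made up for illustration, not taken from your code). Two stacked linear "layers" produce exactly the same output as a single equivalent linear layer:

```java
// Two linear layers with no activation collapse into one linear map,
// so the network can only ever fit a straight line.
public class LinearCollapse {
    public static void main(String[] args) {
        double w1 = 2.0, b1 = 1.0;   // layer 1: h = w1*x + b1
        double w2 = -3.0, b2 = 0.5;  // layer 2: y = w2*h + b2

        double x = 4.0;
        double h = w1 * x + b1;
        double y = w2 * h + b2;

        // The same output from one equivalent linear layer:
        double wEq = w2 * w1;        // combined slope
        double bEq = w2 * b1 + b2;   // combined intercept
        double yEq = wEq * x + bEq;

        System.out.println(y + " == " + yEq); // prints -26.5 == -26.5
    }
}
```

Inserting a non-linear activation between the layers breaks this collapse, which is what lets the network bend its predictions to fit a polynomial.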

These non-linear activation functions have different properties: hyperbolic tangent and sigmoid saturate at both extremes (they are doubly saturating) and are often used in classification tasks or combined with normalization layers, while ReLU (discussed in the link from the comments) avoids saturation for positive inputs and is a common default for hidden layers in regression.
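As a reference for wiring one of these into your forward and backward passes, here is a minimal Java sketch of the three activations and their derivatives (the class and method names are illustrative, not from your code; you need the derivatives for backprop):

```java
// Illustrative activation functions and their derivatives for backprop.
public final class Activations {
    // Hyperbolic tangent: saturates at -1 and +1.
    static double tanh(double x)      { return Math.tanh(x); }
    static double tanhPrime(double x) { double t = Math.tanh(x); return 1.0 - t * t; }

    // Sigmoid: saturates at 0 and 1.
    static double sigmoid(double x)      { return 1.0 / (1.0 + Math.exp(-x)); }
    static double sigmoidPrime(double x) { double s = sigmoid(x); return s * (1.0 - s); }

    // ReLU: no saturation for positive inputs, cheap to compute.
    static double relu(double x)      { return Math.max(0.0, x); }
    static double reluPrime(double x) { return x > 0.0 ? 1.0 : 0.0; }
}
```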

The cost function has nothing to do with non-linearity; it is just a way to measure how far the current predictions are from the goal.
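For completeness, a minimal MSE sketch in the same spirit (names are illustrative); note that nothing here could introduce curvature into the predictions, only the activations can:

```java
// Mean squared error and its gradient with respect to the predictions.
public final class Mse {
    static double cost(double[] predicted, double[] target) {
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double d = predicted[i] - target[i];
            sum += d * d;
        }
        return sum / predicted.length;
    }

    // dC/dPredicted_i = 2 * (predicted_i - target_i) / n, fed into backprop.
    static double[] gradient(double[] predicted, double[] target) {
        double[] g = new double[predicted.length];
        for (int i = 0; i < predicted.length; i++) {
            g[i] = 2.0 * (predicted[i] - target[i]) / predicted.length;
        }
        return g;
    }
}
```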

nsaura