I am training an undercomplete autoencoder for feature selection, with one hidden layer each in the encoder and the decoder. Every layer uses the ELU activation function, and I optimize with the Adam optimizer; to improve convergence I have also added learning rate decay. The model converges well initially, but then the loss plateaus at very large values (around 12 digits) for many epochs and stops improving. How can I solve this issue?
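For reference, the setup described above can be sketched roughly as follows in Keras; the layer sizes, decay schedule, and data here are illustrative assumptions, not the asker's actual code:

```python
import numpy as np
import tensorflow as tf

# Illustrative dimensions (assumed, not from the question).
n_features, n_hidden, n_code = 20, 12, 5

# Undercomplete autoencoder: one ELU hidden layer in the encoder,
# a bottleneck smaller than the input, one ELU hidden layer in the
# decoder, and a linear reconstruction layer.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features,)),
    tf.keras.layers.Dense(n_hidden, activation="elu"),  # encoder hidden
    tf.keras.layers.Dense(n_code, activation="elu"),    # bottleneck
    tf.keras.layers.Dense(n_hidden, activation="elu"),  # decoder hidden
    tf.keras.layers.Dense(n_features),                  # reconstruction
])

# Adam with exponential learning-rate decay (decay settings assumed).
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=1000, decay_rate=0.96)
model.compile(optimizer=tf.keras.optimizers.Adam(schedule), loss="mse")

# Random stand-in data; the autoencoder reconstructs its own input.
X = np.random.rand(256, n_features).astype("float32")
history = model.fit(X, X, epochs=3, batch_size=32, verbose=0)
```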
1 Answer
The trick was to normalize the input dataset by subtracting each column's mean and dividing by its standard deviation. This reduced the loss drastically, and the network now trains much more efficiently. Normalizing the data also makes it easier to interpret the weights associated with each input node, which helps when estimating variable importance.
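The per-column normalization described above (z-score standardization) can be sketched with NumPy; the data here is a made-up example:

```python
import numpy as np

# Hypothetical example data: rows are samples, columns are features
# on wildly different scales, as often causes huge reconstruction losses.
X = np.array([[1.0, 200.0, 3000.0],
              [2.0, 180.0, 2500.0],
              [3.0, 220.0, 3500.0]])

# Subtract each column's mean and divide by its standard deviation.
mean = X.mean(axis=0)
std = X.std(axis=0)
std[std == 0] = 1.0          # guard against constant columns
X_norm = (X - mean) / std

# Every column of X_norm now has mean ~0 and standard deviation ~1,
# so all features contribute on a comparable scale.
```

Remember to apply the same training-set mean and standard deviation when normalizing validation or test data, rather than recomputing them per split.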

Prishita Ray