I am working on a stock market prediction model that combines sentiment with historical prices to predict the next day's price.
I want to test different window (sequence) sizes, e.g. 3 days, 4 days, ..., 10 days, to identify which one predicts the next day's price best. However, the value of num_units in model.add(LSTM(units=num_units)) that works well differs from one window size to another.
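For context, a minimal sketch of how the windows for each sequence length could be built. It uses only NumPy; the helper name make_windows and the placeholder prices array are illustrative assumptions, not my actual pipeline, and it handles a single price feature only (no sentiment column).

```python
import numpy as np

def make_windows(series, window):
    """Return (X, y): X[i] holds `window` consecutive values, y[i] is the value that follows."""
    series = np.asarray(series, dtype=float)
    n = len(series) - window
    if n <= 0:
        raise ValueError("series must be longer than the window")
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window:]
    # Add a trailing feature axis: (samples, timesteps, features), the layout a Keras LSTM takes.
    return X[..., np.newaxis], y

prices = np.arange(20.0)  # placeholder data, stands in for real closing prices
for window in range(3, 11):
    X, y = make_windows(prices, window)
    print(window, X.shape, y.shape)
```

Note that a longer window yields fewer samples from the same series, so the comparison is only like-for-like if every window size is evaluated on the same target days.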
If a smaller window size is paired with a larger num_units, the model appears to over-fit: its prediction for the price at day t+1 is almost equal to the price at day t.
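One way to quantify this symptom is to compare the model against the naive forecast "price at t+1 = price at t". The sketch below is an assumption about how that check could be done, not part of my existing code; rmse and persistence_check are hypothetical helper names.

```python
import numpy as np

def rmse(a, b):
    """Root mean squared error between two equal-length sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

def persistence_check(prices_t, actual_next, predicted_next):
    """Compare the model with the naive forecast that repeats today's price."""
    return {
        # Error of the model against the true next-day price.
        "model_rmse": rmse(predicted_next, actual_next),
        # Error of the naive forecast against the true next-day price.
        "naive_rmse": rmse(prices_t, actual_next),
        # Distance between the model's prediction and today's price.
        "pred_vs_today_rmse": rmse(predicted_next, prices_t),
    }
```

If pred_vs_today_rmse is close to zero and model_rmse is no better than naive_rmse, the network is effectively just copying the last observed price.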
Hence, I am unable to make a fair comparison between different window sizes without also varying num_units.
I have read "How to select number of hidden layers and number of memory cells in an LSTM?" but was unable to come to a conclusion.
Is there an established guideline for choosing num_units in an LSTM layer for time-series prediction based on the sequence length?