
There is plenty of literature describing LSTMs in detail and how to use them for multivariate or univariate forecasting problems. What I couldn't find, though, are any papers or discussions describing time series forecasting where we also have correlated forecast data.

An example best describes what I mean. Say I want to predict the number of people at a beach for the next 24 hours at hourly granularity. This number would clearly depend on the past number of people at the beach as well as on the weather. I can quite easily build an LSTM architecture of some sort to predict these future quantities from what happened in the past. But what if I also have access to weather forecasts for the next 24 hours (and to historical forecast data)?

The architecture I came up with looks like this:

[Figure: forecast architecture]

So I train the upper-left branch on forecast data, then train the upper-right branch on out-turn data, then freeze their layers and join them to form the final network in the picture, which I train on both forecasts and out-turns. (When I say train, the output is always the forecast for the next 24 hours.) This method does in fact perform better than using forecasts or out-turns alone.

I guess my question is: has anyone seen any literature on this topic, or does anyone know a better way to solve this sort of multivariate time series forecasting problem? Is my method okay or completely flawed?

nbro

1 Answer


I have not come across a model like this yet. However, if you have not tried smaller models, I'd recommend trying those first.

Justification: this lets you use learning curves to diagnose what to do next.
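As a rough illustration of what reading learning curves can look like, here is a minimal sketch that classifies a run from its final training and validation losses. The function name `diagnose` and both thresholds (`gap_tol`, `high_loss`) are hypothetical and would need tuning to your loss scale; treat it as a rule of thumb, not a definitive test.

```python
def diagnose(train_loss, val_loss, gap_tol=0.1, high_loss=0.5):
    """Crude reading of learning curves from the final-epoch losses.

    train_loss, val_loss: per-epoch loss values (lists of floats).
    gap_tol, high_loss: hypothetical thresholds, chosen for illustration only.
    """
    final_train, final_val = train_loss[-1], val_loss[-1]
    if final_train > high_loss:
        # The model cannot even fit the training data.
        return "underfitting"
    if final_val - final_train > gap_tol:
        # Training loss is low but validation loss lags behind it.
        return "overfitting"
    return "reasonable fit"


# Example curves (made-up numbers, purely illustrative).
small_model = diagnose([1.2, 0.9, 0.8], [1.3, 1.0, 0.9])
large_model = diagnose([1.0, 0.3, 0.05], [1.1, 0.6, 0.55])
```

With a small model, "underfitting" suggests adding capacity, while "overfitting" suggests regularisation or more data before growing the network further.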

Also, you might try starting with a GRU (the LSTM overhead may not be needed).
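To put a rough number on that overhead: a standard LSTM layer has four gate-like weight blocks (input, forget, output, candidate), while a standard GRU layer has three (reset, update, candidate). The sketch below counts parameters under the assumption of one bias vector per block; some libraries keep two bias vectors per block, so their reported counts will be slightly higher. The function names are illustrative only.

```python
def recurrent_params(n_in, n_hidden, n_blocks):
    """Parameters of a gated recurrent layer with n_blocks weight blocks.

    Each block has an input matrix (n_hidden x n_in), a recurrent matrix
    (n_hidden x n_hidden) and, by assumption here, a single bias vector.
    """
    return n_blocks * (n_hidden * (n_in + n_hidden) + n_hidden)


def lstm_params(n_in, n_hidden):
    return recurrent_params(n_in, n_hidden, n_blocks=4)


def gru_params(n_in, n_hidden):
    return recurrent_params(n_in, n_hidden, n_blocks=3)


# Two input features (e.g. turnout and one weather variable), 32 hidden units.
n_lstm = lstm_params(2, 32)
n_gru = gru_params(2, 32)
```

Under this counting a GRU layer has three quarters of the parameters of an LSTM layer of the same size.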

One idea for a starting model

Observe that turnout does not affect weather. So the weather forecast model will be independent of the turnout model but not vice versa.

(Note: I'm using RNN to denote whichever recurrent architecture you choose.)

Formulation for simultaneous prediction:

Weather

$\text{RNN}_w$ is your weather prediction model

Let $\hat w_t = \text{RNN}_w(\hat w_{t-1})$ be the predicted weather at time $t$

Turnout: The idea here is that people will or will not go to the beach for various reasons (overcrowded, too hot, stormy, etc.). So in the beach population prediction task we use all of these as features to predict the population at the next time-step. This reduces the problem to any one of the classical models already developed.

$\text{RNN}_p$ is your turnout (beach population) prediction model

Let $\hat p_t$ be the predicted turnout at time $t$

$\hat c_t=[\hat p_t, \hat w_t]$ is the concatenation at time $t$

$\hat p_{t+1} = \text{RNN}_p(\hat c_t)$
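The recursion above can be sketched in plain Python as follows. This is only an illustration of the data flow: `make_rnn_cell` builds a tiny, untrained Elman-style cell with random weights as a stand-in for whichever trained recurrent model you use, and all names and sizes here are assumptions, not part of any library.

```python
import math
import random


def make_rnn_cell(n_in, n_hidden, n_out, seed):
    """Build a tiny untrained Elman RNN step function (illustrative stand-in)."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    w_in = rand_matrix(n_hidden, n_in)
    w_h = rand_matrix(n_hidden, n_hidden)
    w_out = rand_matrix(n_out, n_hidden)

    def step(x, h):
        # h' = tanh(W_in x + W_h h); y = W_out h'
        h_new = [
            math.tanh(sum(a * b for a, b in zip(w_in[i], x))
                      + sum(a * b for a, b in zip(w_h[i], h)))
            for i in range(n_hidden)
        ]
        y = [sum(a * b for a, b in zip(row, h_new)) for row in w_out]
        return y, h_new

    return step


def rollout(rnn_w, rnn_p, w_prev, p_now, n_hidden, horizon):
    """Roll both models forward for `horizon` steps.

    w_prev: last known weather vector (plays the role of w_{t-1}).
    p_now:  last known turnout vector (plays the role of p_t).
    """
    h_w, h_p = [0.0] * n_hidden, [0.0] * n_hidden
    w_hat, p_hat = list(w_prev), list(p_now)
    turnout = []
    for _ in range(horizon):
        w_hat, h_w = rnn_w(w_hat, h_w)   # w_t = RNN_w(w_{t-1}); never sees turnout
        c_hat = p_hat + w_hat            # c_t = [p_t, w_t]
        p_hat, h_p = rnn_p(c_hat, h_p)   # p_{t+1} = RNN_p(c_t)
        turnout.append(p_hat[0])
    return turnout


# One weather feature, one turnout feature, 24 hourly steps.
rnn_w = make_rnn_cell(n_in=1, n_hidden=4, n_out=1, seed=0)
rnn_p = make_rnn_cell(n_in=2, n_hidden=4, n_out=1, seed=1)
predictions = rollout(rnn_w, rnn_p, w_prev=[0.3], p_now=[0.8], n_hidden=4, horizon=24)
```

Note that the weather model only consumes its own previous output, while the turnout model consumes both, which mirrors the one-way dependence described above.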

Formulation in the presence of weather forecast:

Simply replace $\hat w_t$ with the actual weather forecasts $\hat f_t$.
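In code, this substitution amounts to dropping the weather model from the loop and feeding the externally supplied forecast at each step. The sketch below assumes the turnout model is any callable taking an input vector and a hidden state; `toy_rnn_p` is a deliberately trivial, hypothetical stand-in (a fixed linear map that ignores its hidden state) so the arithmetic is easy to follow.

```python
def rollout_with_forecasts(rnn_p, forecasts, p_now, h_init):
    """Predict turnout using supplied weather forecasts f_t in place of w_t.

    forecasts: list of forecast vectors, one per future time-step.
    p_now:     last known turnout vector (plays the role of p_t).
    """
    p_hat, h_p = list(p_now), list(h_init)
    turnout = []
    for f_t in forecasts:
        c_hat = p_hat + list(f_t)        # c_t = [p_t, f_t]
        p_hat, h_p = rnn_p(c_hat, h_p)   # p_{t+1} = RNN_p(c_t)
        turnout.append(p_hat[0])
    return turnout


def toy_rnn_p(x, h):
    """Hypothetical stand-in for a trained turnout model (not a real RNN)."""
    return [0.5 * x[0] + 0.25 * x[1]], h


# Two future steps with one forecast feature each (made-up values).
result = rollout_with_forecasts(toy_rnn_p, [[20.0], [30.0]], p_now=[10.0], h_init=[0.0])
```

The length of `forecasts` sets the prediction horizon, so a 24-hour hourly forecast yields 24 turnout predictions.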

A final warning

Unless you think your RNN is better at weather prediction than existing forecasting systems, I would not recommend using $\text{RNN}_w$ in a production application.

I hope this helps.

respectful