What kind of neural network architecture is suitable for variable length block-like time series data?

Question

I'm not sure what this type of data is called, so I will give an example of the type of data I am working with:

A city records its inflow and outflow of different types of vehicles every hour. More specifically, it records the engine size. The output would be the pollution level X hours after the recorded hourly interval.

It's worth noting that the data consists of individual vehicle engine sizes, so they can't be aggregated. This means the 2 input vectors (inflow and outflow) will be of variable length (a different number of vehicles will be entering and leaving every hour), and I'm not sure how to handle this. I could aggregate and simply sum the number of vehicles, but I want to preserve any patterns in the data. E.g. perhaps there is a quick succession of several heavy motorbike engines, denoting that a biker gang has just entered the city; they are known to ride recklessly, contributing more to pollution than the sum of their parts.

Any insight is appreciated.

score 1 · Answer 1 · answered Sep 23 '19 at 12:45

(This response should be a comment, but I don't yet have the reputation to comment.)

If I'm understanding your problem correctly, you have a variable number of ordered inputs and only one output? This looks like the kind of task where you could use a recurrent neural network (the most common ones are the LSTM and GRU).

If you use a recurrent neural network, you could (if you have the timestamps of your data) cut the hourly interval into smaller time steps to help detect patterns.
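To make the time-step idea concrete, here is a minimal sketch of cutting one hour into smaller steps. It assumes each vehicle is recorded as a (minutes past the hour, engine size) pair; the function name `split_hour_into_steps`, the 10-minute step, and the sample values are all made up for illustration and are not from the question.

```python
from typing import List, Tuple


def split_hour_into_steps(
    events: List[Tuple[float, float]],
    step_minutes: int = 10,
) -> List[List[float]]:
    """Group (minute_offset, engine_size) events into fixed-width time steps.

    minute_offset is the number of minutes since the start of the hour,
    expected to lie in [0, 60). Order within each step is preserved.
    """
    if 60 % step_minutes != 0:
        raise ValueError("step_minutes must divide 60 evenly")
    n_steps = 60 // step_minutes
    steps: List[List[float]] = [[] for _ in range(n_steps)]
    # Sort by time so that each step keeps the arrival order of its vehicles.
    for minute_offset, engine_size in sorted(events):
        if not 0 <= minute_offset < 60:
            raise ValueError("minute_offset must be in [0, 60)")
        index = int(minute_offset // step_minutes)
        steps[index].append(engine_size)
    return steps


# Hypothetical inflow for one hour: (minutes past the hour, engine size in litres).
inflow = [(2.0, 1.6), (41.5, 1.2), (3.5, 2.0), (42.0, 1.2), (42.3, 1.3)]
steps = split_hour_into_steps(inflow, step_minutes=10)
```

Each inner list is then one time step for the recurrent network. A burst such as the three vehicles arriving within a minute of each other stays visible as a single crowded step instead of being spread over the whole hour.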

score 1 · Answer 2 · answered Sep 24 '19 at 07:13

I have come across the same issue, but in language, where each input was a sentence and hence of a different length.

The easier solution is to find the longest sequence, take its length, and zero-pad all the other sequences to that size. You can then use any recurrent neural network architecture (since you're dealing with a time series), with the padded sequences as input. When zero-padding, it is best if fewer of your layers have a bias term: you want every multiplication with a padded 0 to give 0, because it represents padding, and a bias would add a small value after the weight multiplies the 0, which is undesirable because it carries no information. One could assume backprop will learn biases of exactly 0 where needed, but in practice it doesn't; the biases mostly end up as extremely small values, which can still hinder results.
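As a rough illustration of the padding step, the sketch below pads every sequence with zeros up to the length of the longest one and also returns the original lengths. The helper name `zero_pad` and the sample engine sizes are hypothetical. It assumes 0 never occurs as a real engine size, so it is unambiguous as a pad value.

```python
from typing import List, Sequence, Tuple


def zero_pad(
    sequences: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[int]]:
    """Right-pad every sequence with 0.0 up to the longest length.

    Returns the padded sequences together with the original lengths,
    so that padded positions can still be told apart from real data.
    """
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths, default=0)
    padded = [list(seq) + [0.0] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


# Hypothetical engine sizes (litres) for three consecutive hours of inflow.
hourly_inflow = [[1.6, 2.0], [1.2], [1.2, 1.2, 1.3, 0.9]]
padded, lengths = zero_pad(hourly_inflow)
```

Keeping `lengths` around is optional, but it allows the padded positions to be masked out later. Some frameworks ship utilities for this (for example sequence packing in PyTorch or a masking layer in Keras), which may be an alternative to relying on bias-free layers.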

Another thing to experiment with is a small network that creates an embedding for each input. The padded input is fed to a small neural network that generates a fixed-size embedding (the activations of a particular layer; backprop will learn a decent representation when trained end-to-end), and that embedding is the input to the recurrent architecture. Adding a network to create an embedding almost always helps, as long as it is kept simple. Experimenting with the embedding network can help you find a well-suited one, which can significantly boost your results.
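The sketch below shows only the shape of that idea: a single bias-free dense layer with a tanh activation that turns a padded input into a fixed-size embedding. The weights here are random and untrained; in a real model they would be learned end-to-end together with the recurrent network. The names `make_weights` and `embed`, the embedding size of 3, and the sample data are assumptions for illustration.

```python
import math
import random
from typing import List


def make_weights(input_size: int, embedding_size: int, seed: int = 0) -> List[List[float]]:
    """Random (untrained) weight matrix of shape embedding_size x input_size."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.5, 0.5) for _ in range(input_size)]
        for _ in range(embedding_size)
    ]


def embed(padded_input: List[float], weights: List[List[float]]) -> List[float]:
    """Bias-free dense layer followed by tanh: one fixed-size embedding."""
    if any(len(row) != len(padded_input) for row in weights):
        raise ValueError("input length must match the weight matrix width")
    return [
        math.tanh(sum(w * x for w, x in zip(row, padded_input)))
        for row in weights
    ]


# Hypothetical zero-padded engine sizes for three hours (padded to length 4).
padded_hours = [
    [1.6, 2.0, 0.0, 0.0],
    [1.2, 0.0, 0.0, 0.0],
    [1.2, 1.2, 1.3, 0.9],
]
weights = make_weights(input_size=4, embedding_size=3)
# One fixed-size embedding per hour; this sequence would feed the recurrent net.
embeddings = [embed(hour, weights) for hour in padded_hours]
```

Because the layer has no bias, an all-padding input maps to an all-zero embedding, which matches the point about biases made above.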
