Adjustments
There are a few considerations that may help with problem analysis:
- Traffic data alone will produce limited predictive results. For instance, the onset of precipitation is a critical input for any production-ready predictive model.
- Traffic congestion is highly chaotic in that events cascade, so what is colloquially called the butterfly effect is a dominant challenge. What you can realistically identify are what chaos theorists call attractors: morphological features in phase space, and auto-correlations, that are predictable under specific conditions.
- Modelling traffic requires further segmentation beyond stop locations to include turn options and merge points, both of which are critical factors in free flow and congestion.
- Feedback from the field will DRASTICALLY improve prediction. Expecting the system to predict congestion dynamics from start to finish more than a few minutes ahead is unrealistic. Inputs will need to be continuous, and the closer to real time, the better.
- The physical relationships between position, velocity, and acceleration must be built into a parameterized model whose parameters are continuously updated; a minimal sketch follows this list.
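A minimal sketch of that last point, assuming per-segment probe readings of vehicle velocity; the names (SegmentState, update, alpha) are illustrative, not from the question:

```python
from dataclasses import dataclass

@dataclass
class SegmentState:
    position: float      # metres travelled along the segment
    velocity: float      # m/s, running estimate
    acceleration: float  # m/s^2, running estimate

def update(state: SegmentState, observed_velocity: float,
           dt: float, alpha: float = 0.3) -> SegmentState:
    """One near-real-time step: estimate acceleration from the change in
    observed velocity, then propagate position with x' = x + v*dt + a*dt^2/2.
    alpha is a smoothing parameter that is itself a candidate for
    continuous re-estimation as field data arrives."""
    accel = (alpha * (observed_velocity - state.velocity) / dt
             + (1.0 - alpha) * state.acceleration)
    position = (state.position + state.velocity * dt
                + 0.5 * accel * dt * dt)
    velocity = alpha * observed_velocity + (1.0 - alpha) * state.velocity
    return SegmentState(position, velocity, accel)

# Example: a vehicle slowing from 12 m/s as congestion begins to form.
state = SegmentState(position=0.0, velocity=12.0, acceleration=0.0)
state = update(state, observed_velocity=9.5, dt=2.0)
```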
We can assume that the system predicts congestion for some purpose, which was not stated in the question. Given the economics of transportation, it is reasonable to assume the objective is to reduce the area under the probability distribution curve for undesirable events resulting from congestion:
- Lost time
- Accident occurrence
- Fatalities
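Purely as an illustration of that objective, congestion could be scored as an expected-cost sum over those event types; the unit costs below are placeholders, not figures from the question:

```python
# Placeholder unit costs per event type -- illustrative only.
UNIT_COST = {
    "lost_time": 30.0,         # per vehicle-hour of delay
    "accident": 50_000.0,      # per accident
    "fatality": 10_000_000.0,  # statistical cost per fatality
}

def expected_congestion_cost(probability: dict, exposure: dict) -> float:
    """Expected cost = sum over event types of P(event) * exposure * unit cost.

    Reducing the area under the probability curve for these events
    corresponds to driving this sum down."""
    return sum(probability[k] * exposure[k] * UNIT_COST[k] for k in UNIT_COST)
```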
The control mechanisms through which such objectives could be achieved include any of the following:
- Notifying drivers of traffic conditions via cell towers or WiFi
- Notifying drivers via dynamic sign display
- Adjusting remotely configurable traffic signaling devices
- Controlling gates
These controls can be directed at human drivers or pilots, or at automated driving or piloting systems. The context of the system is key to designing it properly, since the use cases are entirely driven by that context.
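As a rough architectural illustration, the sketch below fans a single congestion prediction out to several of the control channels above; the ControlChannel names and the dispatch stub are illustrative assumptions, not a real signaling API:

```python
from enum import Enum, auto

class ControlChannel(Enum):
    CELL_BROADCAST = auto()  # driver notification via cell towers or WiFi
    DYNAMIC_SIGN = auto()    # variable message signs
    SIGNAL_TIMING = auto()   # remotely configurable traffic signals
    GATE = auto()            # gates and ramp meters

def dispatch(segment_id: str, congestion_probability: float,
             channels: list) -> None:
    """Fan one congestion prediction out to each configured channel.
    A real system would speak each channel's protocol; this stub only logs."""
    for channel in channels:
        print(f"{channel.name}: segment {segment_id} "
              f"congestion p={congestion_probability:.2f}")

dispatch("I-95-N-mile-12", 0.83,
         [ControlChannel.DYNAMIC_SIGN, ControlChannel.SIGNAL_TIMING])
```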
The Questions
- Is this approach correct, [and] how can I improve it? — See the suggestions above.
- I am not sure if the number of inputs can be increased for better results. — Absolutely, yes; additional inputs such as precipitation onset (see above) should improve results.
- Which activation method will be best to utilize to get a range of conditions/event types rather than binary 1 or 0? — A ReLU or one of its derivatives may produce the best results for offline training; however, it may be advisable to look into real-time approaches such as Q-learning. In either case, it is advisable to finish requirements definition, decide upon an approach, and produce a general architectural diagram of the system before attending to such minutiae as activation functions. A brief sketch of a multi-class output appears after this list.
- [Given the likely data set size of 200GB] can I use my professional-grade laptop to process this data, or should I consider going to Big Data processing power? — Using VLSI hardware acceleration (such as GPUs) from vendors such as Intel or NVIDIA is advisable. Over time, the initial expense and learning curve may be absorbed by not having to pay for additional bandwidth for PaaS or SaaS services.
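On the activation question above, here is a minimal multi-class sketch, assuming TensorFlow/Keras; the layer sizes and five condition classes are illustrative assumptions:

```python
import tensorflow as tf

n_features = 64  # e.g., per-segment speeds, precipitation onset, time of day
n_classes = 5    # e.g., free flow through gridlock, rather than binary 0/1

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features,)),
    tf.keras.layers.Dense(128, activation="relu"),  # ReLU hidden layers
    tf.keras.layers.Dense(64, activation="relu"),
    # Softmax yields a probability distribution over condition classes
    # instead of a single binary activation.
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```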
Regarding computing resources, it is inadvisable to begin with some of the most congested places.
Cities Definitely Too Congested
It is likely that these cities would require a supercomputing platform of significant size and a multinational-corporation-sized budget:
- Beijing
- Dubai
- Tokyo
- Los Angeles
- Chicago
- London
- Hong Kong
- Shanghai
- Paris
- Amsterdam
- Dallas
Commuter locations such as Melbourne, Florida; Palm Beach, Florida; or New Haven, Connecticut are examples of good places to start.