Introduction
Traffic prediction is crucial to many applications, including traffic network planning, route guidance, and congestion avoidance. Our goal is to minimize the time a vehicle needs to travel from point A to point B and to maximize the efficiency of traffic flow, which in turn helps the traffic police manage traffic. Several factors affect traffic prediction:
- Geographical factors, such as road network topology.
- Social factors, such as holidays, concerts, and weekends.
- Limited datasets, i.e., datasets that are either small or not publicly available.
Why is the problem statement important?
The number of vehicles on the road in India has doubled every 8 years since the year 2000. Roads are often inadequately constructed, and there is no proper system to help traffic police officers control the flow of traffic. A traffic police officer usually has no idea how much traffic each lane carries. Emergency vehicles also get stuck in traffic, which can delay their response.
Data Collection
The number of lanes a road has, whether it is a one-way or a two-way street, its speed limit, and its available width determine how much traffic can flow through it efficiently before congestion occurs. Other parameters that identify a road segment are its start and end GPS coordinates and its length. Traffic follows certain patterns: it usually peaks in the morning, an hour or two before lunch, and again in the early evening, reflecting the daily commute of the majority of the working population. Traffic is also high during some festivals and events. To capture such dependencies, we collected data in the form of the average vehicle speed over each road segment in 15-minute intervals.
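The 15-minute averaging described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the record layout `(segment_id, timestamp, speed_kmh)` and the function names are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

INTERVAL_MINUTES = 15  # interval length stated in the text


def bin_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute interval."""
    floored_minute = ts.minute - ts.minute % INTERVAL_MINUTES
    return ts.replace(minute=floored_minute, second=0, microsecond=0)


def average_speeds(readings):
    """Return the average speed per (segment_id, interval start).

    `readings` is an iterable of (segment_id, timestamp, speed_kmh)
    tuples; this layout is hypothetical and chosen for illustration.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [speed total, count]
    for segment_id, ts, speed in readings:
        acc = sums[(segment_id, bin_start(ts))]
        acc[0] += speed
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Toy readings for one segment: two fall in 08:00-08:15, one in 08:15-08:30.
readings = [
    ("seg_1", datetime(2017, 4, 1, 8, 2), 30.0),
    ("seg_1", datetime(2017, 4, 1, 8, 11), 40.0),
    ("seg_1", datetime(2017, 4, 1, 8, 16), 20.0),
]
averages = average_speeds(readings)
```

Each key of `averages` pairs a segment with the start of an interval, so the values form one speed sequence per road segment.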
We used a large-scale traffic prediction dataset, the Q-Traffic dataset, which consists of three sub-datasets: the query sub-dataset, the traffic speed sub-dataset, and the road network sub-dataset. We only used the traffic speed and road network sub-datasets. The attributes of the road segments in the road network sub-dataset are shown in the figure below. There are three kinds of auxiliary domains in the Q-Traffic dataset:
- Geographical and social attributes, including holidays, peak hours, and speed;
- Road intersection information, such as the local road network and junctions;
- Online crowd queries, which record map search queries from users.
Figure: Average Speed of Road Segment with Time
Figure: Historical Data for traffic at -24h, -48h, etc.
Figure: Topology of Road Network
Method
Let m be the number of roads, t the number of consecutive days over which the detector data are collected, and τ the number of time intervals (e.g., 15 minutes each) into which each day is partitioned.
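To make the notation concrete, the sketch below stores speeds in a nested list indexed as `speeds[road][day][interval]` (shape m x t x τ) and looks up the historical values for the same interval 24 h and 48 h earlier. The storage layout, the toy values, and the helper name `lagged_features` are assumptions for illustration only.

```python
INTERVAL_MINUTES = 15
TAU = 24 * 60 // INTERVAL_MINUTES  # intervals per day: 96 for 15-minute bins


def lagged_features(speeds, road, day, interval, lags_days=(1, 2)):
    """Speeds of `road` at the same interval 24 h, 48 h, ... earlier.

    `speeds[road][day][interval]` is a nested list of shape m x t x tau.
    """
    if day < max(lags_days):
        raise ValueError("not enough history for the requested lags")
    return [speeds[road][day - lag][interval] for lag in lags_days]


# Toy data: m = 2 roads, t = 3 days, tau = 96 intervals per day.
m, t = 2, 3
speeds = [
    [[20.0 + 10 * r + 2 * d + (k % 4) for k in range(TAU)] for d in range(t)]
    for r in range(m)
]

# Interval 32 starts at 08:00; look one and two days back from day 2.
features = lagged_features(speeds, road=1, day=2, interval=32)
```

Indexing by day and interval keeps the daily periodicity explicit: a lag of one day is simply the previous entry along the day axis.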
After applying PCA to the data matrix X, we find r non-negligible singular values, so X effectively resides in an r-dimensional subspace of R^p.
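One way to carry out this reduction is PCA through the singular value decomposition, sketched below with NumPy. The relative threshold `rel_tol` used to decide which singular values count as non-negligible is a hypothetical choice, and the matrix here is synthetic toy data rather than the traffic data.

```python
import numpy as np


def pca_reduce(X, rel_tol=1e-2):
    """Project the rows of X onto its r leading principal directions.

    A singular value is treated as non-negligible when it exceeds
    rel_tol times the largest one; rel_tol is an illustrative assumption.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions; s holds the singular values
    # in descending order.
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    r = int(np.sum(s > rel_tol * s[0]))
    return X_centered @ Vt[:r].T, r


rng = np.random.default_rng(0)
# Toy matrix: 50 samples in p = 6 dimensions lying close to a 2-D subspace.
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 1e-6 * rng.normal(size=(50, 6))

Z, r = pca_reduce(X)  # Z has one row per sample and r columns
```

The reduced matrix `Z` keeps one row per sample but only r columns, which is what a downstream regressor would consume.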
Support vector regression (SVR) is then used to make the prediction, based on the attributes obtained from PCA above. The dependencies of a road segment (its nearby road segments) are each weighted by their length before the final feature vector is built and fed to the SVR.
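The length weighting can be sketched as below. The text does not say exactly how the weights are applied, so this example assumes each neighbour's features are scaled by its share of the total neighbour length; the function name and the input layout are hypothetical as well.

```python
def build_feature_vector(own_features, neighbours):
    """Concatenate a segment's features with length-weighted neighbour features.

    `neighbours` is a list of (length_m, features) pairs. Normalising the
    lengths so the weights sum to 1 is an assumption of this sketch.
    """
    total_length = sum(length for length, _ in neighbours)
    vector = list(own_features)
    for length, features in neighbours:
        weight = length / total_length if total_length else 0.0
        vector.extend(weight * f for f in features)
    return vector


# One segment with two neighbours of 300 m and 100 m (weights 0.75 and 0.25).
vector = build_feature_vector(
    own_features=[40.0, 38.0],
    neighbours=[(300.0, [20.0, 10.0]), (100.0, [40.0, 20.0])],
)
```

Vectors built this way could serve as the rows of the training matrix for a regressor such as scikit-learn's `SVR`, with the future speed as the target.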
Result
A one-day traffic speed sequence is used as input to predict the traffic speed for the next 2 hours. The mean absolute percentage error (MAPE) is used to evaluate and compare performance; it is defined as:

$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{t=1}^{N} \left| \frac{v_t - \tilde{v}_t}{v_t} \right|$$

Here, $v_t$ and $\tilde{v}_t$ are the actual and predicted traffic speeds at time t, and N is the number of predicted time steps. Then, depending on the expected traffic and the live traffic data, we control the traffic lights to ensure efficient traffic flow. The results for SVR are given in the figure below. We obtained an overall MAPE score of 9.73.
Figure: MAPE score with Time for SVR
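For reference, the MAPE described above can be computed as in the following sketch. It uses the standard definition expressed as a percentage and assumes that all actual speeds are non-zero; the sample values are made up for illustration and are not results from this work.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero; a zero actual speed would
    make the relative error undefined.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    relative_errors = (abs(v - p) / v for v, p in zip(actual, predicted))
    return 100.0 * sum(relative_errors) / len(actual)


# Made-up speeds (km/h) for four consecutive intervals.
actual = [40.0, 50.0, 20.0, 25.0]
predicted = [36.0, 55.0, 22.0, 25.0]
score = mape(actual, predicted)
```

Here three of the four predictions are off by 10% and one is exact, so `score` comes out at about 7.5.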