
Real-Time and Predictive Traffic Data Analysis

Introduction

Traffic prediction is crucial to many applications, including traffic network planning, route guidance, and congestion avoidance. In this project, we try to minimize the time a vehicle needs to travel from point A to point B and to maximize the efficiency of traffic flow, in order to help the traffic police manage traffic. Several essential factors affect traffic prediction:
  • Geographical factors, such as road network topology.
  • Social factors, such as holidays, concerts, and weekends.
  • Limited data, i.e., datasets that are either small or not publicly available.
The primary aim of the project is to use historical and live traffic data to control the traffic lights for efficient traffic flow.

Why is the problem statement important?

The number of vehicles on the road in India has doubled every 8 years since the year 2000. Apart from the lack of adequately constructed roads, there is no proper system to help traffic police officers control the flow of traffic. Usually, a traffic police officer has no idea how much traffic each lane carries. Emergency vehicles also get stuck in traffic, which can delay their response.

Data Collection


The number of lanes a road has, whether it is a one-way or two-way street, its speed limit, and its available width determine how much traffic can flow through it efficiently before congestion occurs. Other parameters that identify a road segment are its start and end GPS coordinates and its length. Traffic followed certain patterns: it usually peaked in the morning, an hour or two before lunch, and again in the early evening, reflecting the daily commute of the working population. Traffic was also high during some festivals and events. To capture such dependencies, we collected data in the form of the average vehicle speed over each road segment in 15-minute intervals.
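The 15-minute aggregation described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the project: the record layout `(segment_id, seconds_since_midnight, speed_kmh)` and the function name `average_speed_per_interval` are assumptions made for the example.

```python
from collections import defaultdict

INTERVAL_S = 15 * 60  # length of one interval in seconds


def average_speed_per_interval(records):
    """Average raw speed samples per road segment and 15-minute interval.

    `records` is an iterable of (segment_id, seconds_since_midnight, speed_kmh)
    tuples; this layout is hypothetical. Returns a dict that maps
    (segment_id, interval_index) to the mean speed in that interval.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, sample count]
    for segment_id, timestamp_s, speed in records:
        key = (segment_id, int(timestamp_s // INTERVAL_S))
        sums[key][0] += speed
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Toy samples, made up for illustration only.
records = [("A", 0, 30.0), ("A", 600, 40.0), ("A", 900, 20.0), ("B", 899, 50.0)]
averages = average_speed_per_interval(records)
```

With 15-minute intervals, each day is split into 96 bins per road segment, which is the granularity the prediction model works with.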

Whenever an unusually high number of online map queries for a particular destination preceded an event scheduled to take place nearby, we could predict possible congestion. Thus, there was a negative correlation between the number of people searching for a certain destination and the traffic speed in the surrounding area.
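One way to check such a relationship is the Pearson correlation coefficient between query counts and average speed over the same intervals. The sketch below is only an illustration: the function name `pearson` and the toy numbers are assumptions and are not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration: query counts rise while speed drops.
query_counts = [10, 20, 40, 80]
avg_speed_kmh = [45.0, 40.0, 30.0, 15.0]
r = pearson(query_counts, avg_speed_kmh)  # negative for these toy values
```

A coefficient close to -1 would indicate that intervals with many queries tend to be intervals with low speed.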

We used a large-scale traffic prediction dataset, the Q-Traffic dataset, which consists of three sub-datasets: a query sub-dataset, a traffic speed sub-dataset, and a road network sub-dataset. We only used the traffic speed and road network sub-datasets. The attributes of the road segments in the road network sub-dataset are shown in the figure below. There are three kinds of auxiliary domains in the Q-Traffic dataset:
  • Geographical and social attributes, such as holidays, peak hours, and speed;
  • Road intersection information, such as the local road network and junctions;
  • Online crowd queries, which record map search queries from users.

Average Speed of Road Segment with Time

Historical Data for traffic at -24h, -48h, etc.

Topology of Road Network

Method

Let m be the number of roads, t the number of consecutive days over which the detector data are collected, and τ the number of time intervals (e.g., of 15 minutes each) into which each day is partitioned.
X is a matrix with p = t×τ rows and m columns. Thus, each column i of X represents the traffic on the i-th road segment over time, and each row j holds the average speeds of all segments at time interval j.
Applying PCA to X yields r non-negligible singular values, so the columns of X effectively reside in an r-dimensional subspace of R^p.
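The PCA step can be sketched with a singular value decomposition of the centred matrix X. The snippet below is an illustration on synthetic data, not the project's code: the matrix sizes, the random data, the function name `effective_rank`, and the 95% variance cut-off used to decide what counts as "non-negligible" are all assumptions.

```python
import numpy as np


def effective_rank(X, energy=0.95):
    """Return (r, singular_values) for a p x m matrix X.

    r is the smallest number of leading singular values whose squared sum
    reaches the fraction `energy` of the total (an assumed cut-off).
    """
    Xc = X - X.mean(axis=0)                    # centre each road-segment column
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    r = int(np.searchsorted(explained, energy)) + 1
    return min(r, len(s)), s


# Synthetic example: t = 7 days, tau = 96 intervals per day, m = 20 roads.
t, tau, m = 7, 96, 20
p = t * tau
rng = np.random.default_rng(0)
latent = rng.normal(size=(p, 3))               # 3 hidden traffic patterns
mixing = rng.normal(size=(3, m))               # how each road mixes them
X = latent @ mixing + 0.01 * rng.normal(size=(p, m))

r, s = effective_rank(X)
```

Because the synthetic matrix is built from three hidden patterns plus small noise, at most three singular values are needed to reach the cut-off, even though X has 20 columns.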
Support vector regression (SVR) is then used to make the prediction, based on the attributes obtained from PCA above. The dependencies of a road segment (its nearby road segments) are weighted by their length before the final feature vector is built and fed to the SVR.
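A minimal sketch of length-weighted neighbour features fed to an SVR is shown below, using scikit-learn's `SVR`. It is not the project's implementation: the synthetic speeds, the normalisation of the length weights, the choice of features (current speed of the segment plus the weighted neighbour speed), and the hyperparameters are assumptions, and the PCA step is omitted for brevity.

```python
import numpy as np
from sklearn.svm import SVR


def length_weighted_neighbour_speed(speeds, lengths):
    """Combine neighbour speeds into one series, weighted by segment length.

    speeds: array of shape (n_times, n_neighbours)
    lengths: array of shape (n_neighbours,), normalised here to sum to 1
    (the normalisation is an assumption).
    """
    weights = np.asarray(lengths, dtype=float)
    weights = weights / weights.sum()
    return np.asarray(speeds, dtype=float) @ weights


# Synthetic speeds with a daily cycle of 96 intervals (illustration only).
rng = np.random.default_rng(0)
n_times = 400
steps = np.arange(n_times)
base = 40 + 10 * np.sin(2 * np.pi * steps / 96)
target = base + rng.normal(0, 1, n_times)
neighbours = np.column_stack(
    [base + rng.normal(0, 1, n_times) for _ in range(3)]
)
lengths = np.array([120.0, 300.0, 80.0])       # hypothetical lengths in metres

neighbour_speed = length_weighted_neighbour_speed(neighbours, lengths)

# Features at interval j predict the target segment's speed at interval j + 1.
features = np.column_stack([target[:-1], neighbour_speed[:-1]])
labels = target[1:]

split = 300
model = SVR(kernel="rbf", C=10.0, epsilon=0.5)
model.fit(features[:split], labels[:split])
predicted = model.predict(features[split:])
```

Weighting by length lets longer neighbouring segments contribute more to the feature than short ones; other weighting schemes are possible.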

Result

A one-day traffic speed sequence is used as input to predict the traffic speed for the following 2 hours. The mean absolute percentage error (MAPE) is used to evaluate and compare performance; it is defined as:

$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{t=1}^{N} \frac{\lvert v_t - \tilde{v}_t \rvert}{v_t}$$

Here, $v_t$ and $\tilde{v}_t$ are the actual and predicted traffic speeds at time t, and N is the number of predicted time steps. Then, depending on the expected traffic and the live traffic data, we control the traffic lights to ensure efficient traffic flow. The results for SVR are given in the figure below. We obtained an overall MAPE of 9.73%.
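As a small worked example, MAPE can be computed as the mean of the absolute errors relative to the actual speeds, expressed in percent. The function name `mape` and the sample speeds are made up for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Computes 100 / N * sum(|v_t - v_hat_t| / |v_t|) over all N time steps.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for v, v_hat in zip(actual, predicted):
        if v == 0:
            raise ValueError("MAPE is undefined when an actual value is zero")
        total += abs(v - v_hat) / abs(v)
    return 100.0 * total / len(actual)


# Relative errors are 10%, 10% and 5%, so the mean is about 8.33%.
score = mape([40.0, 50.0, 20.0], [36.0, 55.0, 21.0])
```

Because each error is divided by the actual speed, MAPE is comparable across fast and slow road segments, but it is undefined when an actual speed is zero.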

MAPE score with Time for SVR


Conclusion

We integrate spatial data and historical traffic data with real-time traffic data to predict the flow of traffic on each road segment and to identify areas where congestion might occur. By predicting future traffic flow, we can decrease the response time of emergency vehicles and ensure that they are not caught in a traffic jam.

Future Possible Work

Convolutional neural networks could be applied to capture the spatial dependencies between road intersections. The volume of online map queries is inversely correlated with the traffic flow of the nearby area: a high number of queries for a particular destination indicates a possible future event that will lead to traffic congestion in the neighborhood. A hybrid model could therefore be built using all three auxiliary domains, i.e., traffic flow, spatial relations, and query counts. We would also like to add a dedicated feature that gives emergency vehicles free passage.
