
Traffic Violations in Metropolitan Cities

Introduction

With the advent of the smartphone era and the availability of 4G internet across the country, police forces have begun to issue electronic versions of the traditional traffic challans. An E-Challan is an electronically generated penalty receipt that takes the place of the physical paper receipt and helps digitize the whole process of collecting challans and penalizing violations.
In this project, we analyze the set of all unpaid E-Challans collected in metropolitan cities over a large span of time to gain unique insights about the nature of traffic violations in such cities. The problem is very relevant for a course on Big Data & Policing as it tries to answer the following important questions:
  1. How are traffic violations distributed spatially and temporally across the city boundaries?
  2. Can the most common violation types be characterized and be used for providing intervention insights?
  3. How can police leverage social media for increasing awareness and for targeted intervention measures?
The problem of characterizing traffic violations has serious implications for society at large, as there have been more than 1.3 million road accidents in India in the last decade.

Data collection:

We collect about four years' worth of data for the city of Ahmedabad. Ahmedabad was chosen in particular because its data is easily available and its portal has no Captcha. The Ahmedabad police provide all the E-Challan data on their website https://payahmedabadechallan.org/ and the challans for a particular vehicle can be obtained by entering its license plate number. We use a headless browser driven by Selenium to make repeated requests with all possible vehicle plate numbers and collect the data from the results returned. The data has several fields, one of which is the violation type, covering violations like driving without a helmet, wrong parking, no seatbelt etc. In total, the data that we collect has the following fields:
  • Vehicle Number
  • Date of Issuing Challan
  • Place of Issue
  • Violation Type 
  • Fine Amount
E-Challan
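
The post does not say how the plate numbers were enumerated. As a minimal sketch, assuming the common Indian plate layout (state code, two-digit RTO code, letter series, four-digit number), candidate plates could be generated lazily as below. The function name `candidate_plates` and the default codes `GJ` and `01` are illustrative assumptions, not details taken from the original crawl.

```python
from itertools import islice, product
from string import ascii_uppercase


def candidate_plates(state="GJ", rto="01", series_len=2):
    """Yield plate strings such as 'GJ01AA0001' in lexicographic order.

    Assumes the layout: state code, two-digit RTO code, a letter series
    and a four-digit number. The defaults are illustrative only.
    """
    for letters in product(ascii_uppercase, repeat=series_len):
        series = "".join(letters)
        for number in range(1, 10000):
            yield f"{state}{rto}{series}{number:04d}"


# Peek at the first few candidates without materialising the full list.
first_three = list(islice(candidate_plates(), 3))
print(first_three)
```

With a two-letter series this yields 676 × 9999 = 6,759,324 candidates for a single RTO code, so a generator is preferable to building the whole list in memory before the requests are made.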

Dataset Statistics:

The data was collected over a period of 15 days and the brief dataset statistics are given below:
Descriptive Statistics

Analysis:

We perform our analysis over three main domains: 1) Temporal Analysis, 2) Spatial Analysis and 3) Violation Distribution. Thus, the data can be used to answer questions such as how to deploy forces in a particular area of the city at a given time (weekends or weekdays) and for a particular type of violation (e.g. driving without a helmet).
In order to perform spatial analysis, we first geocode the locations provided and generate a heatmap of the traffic violations for the city of Ahmedabad. The heatmap given below can be used to infer the hotspots of traffic violations in the city. We can see from the heatmap that most of the violations are clustered in certain areas, shown as bright red regions. Such heatmaps can be made for each type of violation.

Heatmap of traffic violations
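
The counting step behind such a heatmap can be sketched as follows, assuming the places of issue have already been geocoded to latitude and longitude. The function `grid_counts`, the cell size and the sample coordinates are all hypothetical, and this is not necessarily how the plot in the post was produced.

```python
from collections import Counter


def grid_counts(points, cell_deg=0.01):
    """Count points per grid cell; cells are keyed by integer (row, col).

    points: iterable of (latitude, longitude) pairs in decimal degrees.
    cell_deg: cell size in degrees (0.01 degrees is roughly 1 km at
    Ahmedabad's latitude).
    """
    counts = Counter()
    for lat, lon in points:
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        counts[cell] += 1
    return counts


# Made-up coordinates, for illustration only.
sample_points = [
    (23.0225, 72.5714),
    (23.0231, 72.5718),
    (23.0262, 72.5745),
    (23.0711, 72.5172),
]
hotspots = grid_counts(sample_points)
print(hotspots.most_common(1))
```

Cells with the highest counts correspond to the brightest regions of a heatmap. Filtering the input by violation type before counting gives one grid per violation.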
We similarly perform temporal analysis to see how traffic violations vary over time. The plot below shows the number of challans for each day over a period of 4 years, from 2015 to 2019. The plot reveals certain interesting trends in traffic violations around festival dates. There are peaks on the days immediately following festivals like Diwali, which suggests that most people either do not travel during these holidays or are away on outstation trips.
Time series plot of E-Challan
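
A daily time series of this kind, together with a simple way of flagging unusually busy days, might be computed as in the sketch below. The threshold rule (mean plus a multiple of the standard deviation) and the toy dates are assumptions for illustration. The post does not describe how its peaks were identified.

```python
from collections import Counter
from datetime import date
from statistics import mean, pstdev


def daily_counts(issue_dates):
    """Return a Counter mapping each issue date to its number of challans."""
    return Counter(issue_dates)


def peak_days(counts, z=2.0):
    """Return the dates whose count exceeds mean + z * standard deviation."""
    values = list(counts.values())
    threshold = mean(values) + z * pstdev(values)
    return sorted(day for day, n in counts.items() if n > threshold)


# Toy data: ten ordinary days with 5 challans each, one day with 50.
issue_dates = [date(2018, 1, d) for d in range(1, 11) for _ in range(5)]
issue_dates += [date(2018, 1, 11)] * 50

counts = daily_counts(issue_dates)
print(peak_days(counts))
```

Days with no challans at all are absent from the counter, so a real analysis would need to fill them in with zeros before computing the threshold.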
The third aspect of our analysis deals with the distribution of violations and their contribution to the total money owed. We can infer from the pie chart below that the majority of challans are for violations like riding without a helmet and jumping a red light. Thus, reducing these two violations alone would make the roads much safer.
Violation Distribution
The above two violations also contribute the most to the total amount of money owed to the government. The plot below shows the top 5 violations by amount owed to the government and the relative scale of those amounts.
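
Both views, the share of challans per violation type and the ranking by amount owed, can be derived in one pass over the records. The sketch below assumes each record reduces to a (violation type, fine amount) pair. The name `summarise_violations` and the fine amounts in the toy data are made up for illustration.

```python
from collections import Counter, defaultdict


def summarise_violations(challans, top=5):
    """Return (share of challans per type, top types by amount owed).

    challans: iterable of (violation_type, fine_amount) pairs.
    """
    counts = Counter()
    owed = defaultdict(int)
    for violation, fine in challans:
        counts[violation] += 1
        owed[violation] += fine
    total = sum(counts.values())
    shares = {v: n / total for v, n in counts.items()}
    top_owed = sorted(owed.items(), key=lambda kv: kv[1], reverse=True)[:top]
    return shares, top_owed


# Toy records with made-up fine amounts.
toy = [("No helmet", 100)] * 3 + [("Red light", 500), ("Wrong parking", 100)]
shares, top_owed = summarise_violations(toy)
print(shares)
print(top_owed)
```

In this toy data the most frequent violation is not the one with the largest amount owed. The two rankings can differ whenever fines vary by violation type, which is why they are worth computing separately.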

Characterizing user behaviour:

In order to undertake targeted intervention, it is necessary to characterize user behaviour and find repeat offenders. In our dataset, there were a large number of repeat offenders, and one of them even had 67 E-Challans to their name for the same type of violation. Catching hold of such offenders is the easiest way to tackle the menace of traffic violations. To characterize repeat offence behaviour, we plot the number of vehicles against the number of challans per vehicle. From the plot below, it can be seen that a significant share of users (> 25%) commit offences repeatedly.
No. of Challans vs Vehicles
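
The distribution behind this plot can be obtained by counting challans per vehicle and then counting how many vehicles fall at each challan count. This is a minimal sketch. The function names and the placeholder vehicle identifiers are hypothetical.

```python
from collections import Counter


def challans_per_vehicle(vehicle_numbers):
    """Map each vehicle number to its number of challans."""
    return Counter(vehicle_numbers)


def vehicles_by_challan_count(per_vehicle):
    """Map a challan count k to the number of vehicles with exactly k challans."""
    return Counter(per_vehicle.values())


def repeat_share(per_vehicle):
    """Fraction of vehicles that have two or more challans."""
    repeaters = sum(1 for n in per_vehicle.values() if n >= 2)
    return repeaters / len(per_vehicle)


# One entry per challan; the identifiers are placeholders, not real plates.
rows = ["V1", "V2", "V2", "V3", "V4", "V4", "V4"]
per_vehicle = challans_per_vehicle(rows)
print(vehicles_by_challan_count(per_vehicle))
print(repeat_share(per_vehicle))
```

Here two of the four vehicles have more than one challan, so the repeat share is 0.5. Sorting `per_vehicle` by count would surface the heaviest offenders directly.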
To further characterize user behaviour, we collect the data again after 3 months to obtain statistics on the average repayment time and its distribution. The second collection revealed some striking statistics:
  1. Only 3.5% of the people paid some of their challans in 3 months
  2. The average time of Challan payment was a whopping 339 days (approx. 11 months)
  3. For 13.25% of the people, the number of Challans increased in the 3 months, indicating repeat offence behaviour
  4. Of the people who paid their challans, 10% chose not to pay all of their challans at once
These statistics point to either a lack of awareness about E-Challans among the public or a lack of incentive to pay the challan amount as soon as possible. As a result, the effect of E-Challans is diminished significantly.
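
Statistics like the first and third items above can be estimated by comparing the two collections. The sketch below assumes each snapshot has been reduced to a mapping from vehicle number to its count of unpaid challans, and that a vehicle missing from the later snapshot has cleared everything. The function name, the dictionary layout and the toy values are all assumptions.

```python
def compare_snapshots(before, after):
    """Compare unpaid-challan counts per vehicle across two collections.

    before, after: dicts mapping vehicle number -> number of unpaid challans.
    Returns the fraction of vehicles in `before` whose count went down
    (paid at least some) and the fraction whose count went up.
    """
    if not before:
        raise ValueError("the first snapshot is empty")
    total = len(before)
    paid_some = sum(1 for v, n in before.items() if after.get(v, 0) < n)
    increased = sum(1 for v, n in before.items() if after.get(v, 0) > n)
    return {"paid_some": paid_some / total, "increased": increased / total}


# Toy snapshots with placeholder vehicle identifiers.
before = {"A": 3, "B": 1, "C": 2, "D": 1}
after = {"A": 1, "B": 2, "C": 2}
result = compare_snapshots(before, after)
print(result)
```

Comparing net counts cannot distinguish a vehicle that paid one challan and received a new one from a vehicle that did nothing. Matching individual challans by date and place of issue would be needed for that.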

Suggestions for the police:

Based on the above analysis of E-Challan data over a period of 4 years, we would like to make the following suggestions to the Ahmedabad police and the state government:
  1. Deploy traffic police in areas that are major hotspots as given in the above spatial heatmap
  2. The traffic police need to be more attentive on the days preceding and succeeding the festivals
  3. There is a general lack of awareness amongst people regarding the system of E-Challans, and this needs to be addressed using awareness campaigns and social media
  4. There is no incentive for people to pay their challans early or on time, so disincentives like interest on the challan amount can be used
Many more plots and other interesting analyses have been produced from this data. For more information or analysis, mail us at shashank[dot]s[at]research[dot]iiit[dot]ac[dot]in
