
Stolen Vehicle Identification and Number plate detection





What are we trying to do?

  • With vehicle thefts rising to alarming levels, our platform helps identify regions with high theft rates and provides a way to detect stolen vehicles by scanning number plates. It can flag stolen vehicles in real time, and the project can be scaled to work on a live video stream instead of still images.
  • We scraped and analysed the vehicle theft data available for the Delhi region as part of this four-month project marathon. The project has been built so that it can later be scaled to a larger area.

The Data
  • The dataset was obtained from the Delhi Police website (https://zipnet.delhipolice.gov.in/) and scraped using Beautiful Soup. We scraped 8 months of data, but the method can be scaled to pick up new FIRs (First Information Reports) on a daily basis.
  • Each FIR is entered manually by a police officer, so the data is noisy and has to be cleaned before any analysis can be done. For example, the place of theft is written in an arbitrary, free-form way and cannot be used as-is.
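
The kind of cleaning described above can be sketched as follows. This is a minimal illustration and not the project's actual cleaning code: the abbreviation table and the rule of appending "delhi" are assumptions made for the example.

```python
import re

# Hypothetical abbreviation table (an assumption for this sketch);
# the real FIR data would need its own, much longer list.
ABBREVIATIONS = {
    "ps": "police station",
    "nr": "near",
    "opp": "opposite",
    "mkt": "market",
}


def clean_address(raw: str) -> str:
    """Normalise a free-text 'place of theft' field before geocoding."""
    text = raw.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Expand known abbreviations word by word; split() also collapses whitespace.
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    # Add the city when it is missing so a geocoder is biased towards Delhi.
    if "delhi" not in words:
        words.append("delhi")
    return " ".join(words)
```

For example, `clean_address("Nr. Metro Stn, Opp. PS  Rohini!!")` returns `"near metro stn opposite police station rohini delhi"`.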

Talking places
  • For the analysis, we wanted locations, or more precisely a map representation.
  • This makes it possible to pinpoint where theft incidents occur.
  • The address fields therefore had to be annotated with their corresponding coordinates.
  • We used the Google geocoding API to map the theft addresses to their corresponding coordinates.
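
A rough sketch of this address-to-coordinate step is shown below. It assumes the JSON endpoint of the Google Geocoding API and an API key supplied by the caller. The function names are made up for this example, and the project's own code may look different.

```python
import json
import urllib.parse
import urllib.request

# JSON endpoint of the Google Geocoding API.
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address: str, api_key: str) -> str:
    """Build the request URL for one address."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    return f"{GEOCODE_URL}?{query}"


def extract_lat_lng(response: dict):
    """Return (lat, lng) of the first result, or None if nothing was found."""
    if response.get("status") != "OK" or not response.get("results"):
        return None
    location = response["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def geocode(address: str, api_key: str):
    """Geocode one address over the network (needs a valid API key)."""
    url = build_geocode_url(address, api_key)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return extract_lat_lng(json.load(resp))
```

Keeping `extract_lat_lng` separate from the network call means the parsing logic can be checked against a saved response without spending API quota.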

Analysis
  • Location-based analysis: We used Google's geocoding API to get the coordinates of the places from which vehicles were stolen, then generated a heat map from those coordinates. Our analysis showed that:
    • Most dark hotspots were parking areas of Metro stations, hospitals, malls, etc.
    • The Old Delhi area had more dark hotspots than New Delhi.
    • The Army Cantonment (Cantt) area in Delhi had the fewest hotspots.

      map.png
  • Day-wise/weekly analysis: The day-wise analysis shows how often motor vehicle thefts occur on each day of the week. We found that:
    • There are fewer thefts on weekends than on weekdays. One possible reason is that Metro parking lots stay emptier on holidays than on weekdays, so fewer thefts are reported.
  • Date-wise analysis: We analysed the frequency of FIRs by date of the month and got some unexpected results. We were expecting that:
    • Thefts would be higher at the start of the month.
    • Thefts would be higher at the end of the month.
    • We found neither. Surprisingly, theft frequency was almost evenly distributed throughout the month.
  • Festival analysis: We observed that the frequency of thefts rose sharply during the festival season. Compared with normal days, thefts increased by:
    • 25% during Navratri.
    • 8% during Diwali.
    • 25% around New Year.
  • Validation of results:
    • To validate the heat-map results, we generated a word map from all the addresses written in the FIRs. The word map and the heat map gave similar results.
    • Our analysis showed 115-125 FIRs for vehicle theft per day in Delhi. To validate this, we found an article that reports similar numbers.
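
The counting behind these analyses can be sketched in a few lines. The code below is an illustration under stated assumptions, not the project's code. The grid cell size and the stop-word list are arbitrary choices, and a plotting library would still be needed to draw the actual heat map or word map.

```python
import math
import re
from collections import Counter
from datetime import date

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def grid_counts(points, cell_deg=0.01):
    """Bin (lat, lng) points into square cells; cell counts drive heat-map intensity."""
    return Counter(
        (math.floor(lat / cell_deg), math.floor(lng / cell_deg))
        for lat, lng in points
    )


def weekday_counts(dates):
    """Count FIRs per day of the week (date.weekday(): Monday is 0)."""
    return Counter(DAY_NAMES[d.weekday()] for d in dates)


def percent_change(baseline_per_day, observed_per_day):
    """Percentage increase of a festival-period rate over the normal daily rate."""
    return 100.0 * (observed_per_day - baseline_per_day) / baseline_per_day


def word_counts(addresses, stopwords=frozenset({"near", "delhi", "new"})):
    """Word frequencies over all addresses, the input to a word map."""
    words = Counter()
    for address in addresses:
        for word in re.findall(r"[a-z]+", address.lower()):
            if word not in stopwords:
                words[word] += 1
    return words
```

As a made-up illustration, a normal rate of 120 FIRs per day and a festival rate of 150 per day gives `percent_change(120, 150) == 25.0`.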
OpenALPR:
LPR (License Plate Recognition), sometimes called ALPR (Automatic License Plate Recognition), has 3 major stages.
  1. License Plate Detection: This is the first and probably the most important stage of the system. It determines the position of the license plate. The input at this stage is an image of the vehicle and the output is the license plate region.
  2. Character Segmentation: At this stage, the characters on the license plate are located and segmented into individual images.
  3. Character Recognition: This is where we wrap things up. The previously segmented characters are identified here, using machine learning.
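
To make stage 2 more concrete, here is a toy version of character segmentation using a column projection: columns with no "ink" are treated as gaps between characters. This is a simplified textbook technique chosen for illustration. It is not a description of how OpenALPR segments characters internally, and it assumes the plate has already been cropped and binarised.

```python
def segment_columns(binary_rows):
    """Split a binarised plate image into character spans.

    binary_rows: list of equal-length rows, 1 = ink, 0 = background.
    Returns a list of (start, end) column ranges, with end exclusive.
    """
    if not binary_rows:
        return []
    width = len(binary_rows[0])
    # Column projection: how many ink pixels each column contains.
    ink = [sum(row[x] for row in binary_rows) for x in range(width)]

    spans = []
    start = None
    for x, amount in enumerate(ink):
        if amount > 0 and start is None:
            start = x                    # a character begins
        elif amount == 0 and start is not None:
            spans.append((start, x))     # a character ends at the gap
            start = None
    if start is not None:
        spans.append((start, width))     # character touches the right edge
    return spans
```

Each returned span can then be cropped out and passed to the character recognition stage.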



Output:
  • A JSON object (color, number plate, model, registration number)
  • It has already been tested on a large dataset of Indian cars and gives quite accurate predictions.

Combining both:
  1. Take an image of the vehicle.
  2. Extract the number plate data from it.
  3. Look the number up in the database of stolen vehicles.
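
The lookup step can be sketched as below, assuming the scraped FIRs have been loaded into a SQLite table. The table name, column names, and sample plate are hypothetical and chosen only for this example. Normalising the plate text matters because recognised plates and handwritten FIR entries rarely agree on spacing and case.

```python
import re
import sqlite3


def normalise_plate(text: str) -> str:
    """Upper-case the plate and drop spaces, hyphens, and other punctuation."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def build_db(records):
    """Create an in-memory table of stolen plates from (plate, fir_no) pairs.

    The 'stolen' table and its columns are assumptions made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stolen (plate TEXT PRIMARY KEY, fir_no TEXT)")
    conn.executemany(
        "INSERT INTO stolen VALUES (?, ?)",
        [(normalise_plate(plate), fir_no) for plate, fir_no in records],
    )
    return conn


def lookup(conn, plate_text: str):
    """Return the FIR number if the plate is reported stolen, else None."""
    row = conn.execute(
        "SELECT fir_no FROM stolen WHERE plate = ?",
        (normalise_plate(plate_text),),
    ).fetchone()
    return row[0] if row else None
```

With a made-up record `("DL 3C AB 1234", "FIR-001")` in the table, `lookup(conn, "dl3cab1234")` returns `"FIR-001"`, and an unknown plate returns `None`.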

Future work
  • Implement video-based extraction of number plates
  • Live database scraping (daily updates)
  • Scaling to more cities/regions where theft data is available online

Where can we get the images from?
  • Toll plazas are excellent examples, since they provide a stable video feed.
  • CCTV cameras
