
Stolen Vehicle Identification and Number plate detection





What are we trying to do?

  • With vehicle thefts rising to alarming levels, our platform helps identify regions with high theft rates and provides a way to detect stolen vehicles by scanning number plates. It can detect stolen vehicles in real time, and the project can be scaled to use a live video stream instead of still images.
  • As part of this four-month project marathon, we scraped and worked on the vehicle theft data available for the Delhi region. The project has been built so that it can later be scaled to a larger domain.

The Data
  • The dataset was obtained from the Delhi Police website (https://zipnet.delhipolice.gov.in/). We scraped the FIRs (First Information Reports) for the Delhi region using Beautiful Soup. Our scrape covers a span of 8 months, but the method can be scaled to update FIRs on a daily basis.
  • Each FIR is entered manually by a police officer, so the data is noisy and has to be cleaned before any analysis can be done. For example, the place of theft is written as arbitrary free text and cannot be used as-is.
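
To give an idea of the kind of cleaning the place-of-theft field needs, here is a minimal sketch in Python. The function name `clean_place` and the specific rules (unifying separators, collapsing whitespace, appending the city name) are assumptions for illustration, not the exact rules we applied to the FIR data.

```python
import re


def clean_place(raw: str, city: str = "Delhi") -> str:
    """Normalize a free-text place-of-theft string (illustrative rules only)."""
    text = raw.strip()
    # Turn runs of ',', ';' or '/' (with any surrounding spaces) into ", ".
    text = re.sub(r"\s*[,;/]+\s*", ", ", text)
    # Collapse any remaining runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    # Drop stray separators left at either end of the string.
    text = text.strip(" ,.-")
    # Add the city name when it is missing, so a geocoder has some context.
    if city.lower() not in text.lower():
        text = f"{text}, {city}"
    return text


if __name__ == "__main__":
    print(clean_place("  near  metro stn ;; rohini  "))
    print(clean_place("Karol Bagh, New Delhi."))
```

Rules like these only remove formatting noise. Misspellings and vague landmarks in the FIR text still need manual review or a geocoder that tolerates them.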

Talking places
  • For the analysis, we wanted locations, or more precisely a map representation.
  • This helps pinpoint where theft incidents occur.
  • Hence, each address field had to be annotated with its coordinates.
  • We used the Google geocoding API to map the theft locations to their corresponding coordinates.
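
The address-to-coordinate step can be sketched as below. This is a rough outline, assuming the JSON endpoint of the Google Geocoding web service and its `status` / `results[0].geometry.location` response layout. The helper names and the API key are placeholders, and this is not the exact script we ran.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address: str, api_key: str) -> str:
    """Build the request URL for one address (api_key is a placeholder)."""
    return f"{GEOCODE_ENDPOINT}?{urlencode({'address': address, 'key': api_key})}"


def parse_geocode_response(payload: dict):
    """Return (lat, lng) of the first result, or None if nothing was found."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def geocode(address: str, api_key: str):
    """Fetch and parse coordinates for one address (needs network access)."""
    with urlopen(build_geocode_url(address, api_key)) as response:
        return parse_geocode_response(json.load(response))
```

Keeping URL building and response parsing separate from the network call makes it easy to test the parsing on saved responses and to add rate limiting or caching around `geocode` later.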

Analysis
  • Location-based analysis: We used Google’s geocode API to get the coordinates of the places from which vehicles were stolen, and then generated a heat map from those coordinates. Our analysis showed that:
    • Most dark hotspots were the parking areas of Metro stations, hospitals, malls, etc.
    • The Old Delhi area had more dark hotspots than New Delhi.
    • The Army Cantonment area in Delhi had the fewest hotspots.

      [Figure: heat map of theft locations (map.png)]
  • Day/weekly analysis: The day-wise analysis shows the frequency of motor vehicle thefts for each day of the week. We found that:
    • There are fewer thefts on weekends than on weekdays. One possible reason is that Metro parking lots remain emptier on holidays than on weekdays, which leads to fewer theft reports.
  • Date-wise analysis: We analyzed the date-wise frequency of FIRs and got some unexpected results. We were expecting that:
    • Thefts would be higher at the start of the month.
    • Thefts would be higher at the end of the month.
    • We found neither. Surprisingly, theft frequency was almost evenly distributed throughout the month.
  • Festival analysis: We observed that the frequency of thefts increased suddenly during the festival season. Compared with normal days, thefts increased by:
    • 25% during Navratri.
    • 8% during Diwali.
    • 25% during New Year.
  • Validation of results:
    • To validate the heat-map results, we generated a word map from all the addresses written in the FIRs. The word map and the heat map gave similar results.
    • Our analysis showed 115-125 FIRs for vehicle theft per day in Delhi. To validate this, we found an article that reports similar figures.
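
The day-wise counts and the festival comparison above come down to simple counting and a percentage change. A minimal sketch follows, assuming each FIR has already been reduced to a Python `date`. The function names and the sample numbers are made up for illustration and are not taken from our dataset.

```python
from collections import Counter
from datetime import date

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def weekday_counts(fir_dates):
    """Count FIRs per day of the week (date.weekday() gives Monday as 0)."""
    return Counter(WEEKDAY_NAMES[d.weekday()] for d in fir_dates)


def percent_change(period_daily_avg, baseline_daily_avg):
    """Percentage change of a period's daily average against a baseline."""
    return 100.0 * (period_daily_avg - baseline_daily_avg) / baseline_daily_avg


if __name__ == "__main__":
    # Made-up sample dates: two Mondays and one Tuesday.
    sample = [date(2019, 1, 7), date(2019, 1, 8), date(2019, 1, 14)]
    print(weekday_counts(sample))
    # Made-up averages: 150 thefts/day in a festival week vs 120 on normal days.
    print(percent_change(150, 120))  # 25.0
```

The same `percent_change` helper works for any festival window, as long as the baseline average is computed over days outside that window.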
OpenALPR:
LPR (License Plate Recognition), sometimes called ALPR (Automatic License Plate Recognition), has 3 major stages.
  1. License Plate Detection: This is the first and probably the most important stage of the system. Here the position of the license plate is determined. The input is an image of the vehicle and the output is the license plate region.
  2. Character Segmentation: At this stage the characters on the license plate are located and segmented into individual images.
  3. Character Recognition: This is where we wrap things up. The previously segmented characters are identified here. We use machine learning for this.
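
To make the character segmentation stage more concrete, here is a toy sketch of one classic approach, a vertical projection profile over a binarized plate image. This is only an illustration of the idea and is not how OpenALPR implements segmentation internally. The function name and the tiny sample image are made up.

```python
def segment_columns(binary_rows):
    """Split a binary plate image into per-character column spans.

    binary_rows is a list of equal-length rows, with 1 for ink pixels and
    0 for background. Returns a list of (start, end) column spans, end exclusive.
    """
    width = len(binary_rows[0])
    # Vertical projection: number of ink pixels in each column.
    profile = [sum(row[x] for row in binary_rows) for x in range(width)]

    spans, start = [], None
    for x, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = x                  # a character begins at this column
        elif ink == 0 and start is not None:
            spans.append((start, x))   # an empty column ends the character
            start = None
    if start is not None:
        spans.append((start, width))   # character touching the right edge
    return spans


if __name__ == "__main__":
    plate = [
        [0, 1, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 1, 1],
    ]
    print(segment_columns(plate))  # [(1, 3), (5, 7)]
```

Each span can then be cropped out and passed to the recognition stage. Real plates also need thresholding, deskewing and noise removal before a projection like this is reliable.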



Output:
  • A JSON object (color, number plate, model, registration number).
  • It has already been tested on a large dataset of Indian cars and gives quite accurate predictions.

Combining both:
  1. We took an image of a vehicle.
  2. We extracted the number plate data from it.
  3. We looked up the extracted number plate in the database of stolen vehicles.
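
The lookup step can be sketched as follows. This is a minimal example, assuming the stolen-vehicle records sit in an SQLite table keyed by a normalized plate string. The table name, the helper names and the plate numbers are hypothetical and do not describe our actual database.

```python
import re
import sqlite3


def normalize_plate(text: str) -> str:
    """Uppercase the plate and keep only letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def build_stolen_db(plates):
    """Create an in-memory table of stolen plates (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stolen (plate TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO stolen VALUES (?)",
        [(normalize_plate(p),) for p in plates],
    )
    return conn


def is_stolen(conn, recognized_plate: str) -> bool:
    """Check whether a recognized plate appears in the stolen table."""
    row = conn.execute(
        "SELECT 1 FROM stolen WHERE plate = ?",
        (normalize_plate(recognized_plate),),
    ).fetchone()
    return row is not None


if __name__ == "__main__":
    conn = build_stolen_db(["DL 3C AB 1234"])  # made-up plate number
    print(is_stolen(conn, "dl3c-ab-1234"))
    print(is_stolen(conn, "DL1XX0000"))
```

Normalizing both the stored and the recognized plate the same way avoids misses caused by spaces, hyphens or letter case in either source.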

Future work
  • Implement video-based extraction of number plates
  • Live database scraping (daily updates)
  • Scaling to more cities/regions where theft data is available online

Where can we get the images from?
  • Toll plazas are excellent examples (stable video feed).
  • CCTV cameras
