Arabic Car License Recognition. A solution to the Kaggle competition Machathon 3.0.

Overview

Arabic licence plate recognition 🚗

  • Solution to the Kaggle competition Machathon 3.0.
  • Ranked in the top 6️⃣ at the final evaluation phase.
  • Check out our solution on Colab!
  • Check out the solution presentation.

Preprocessing Pipeline

The schematic of the processor

Approach

Step1: Preprocessing enhancements on the image.

  • Most images had bad illumination and noise
    • Morphological operations to maximize contrast.
    • Gaussian blur to remove noise.
  • Thresholding on both the Value and Saturation channels.
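
As a rough illustration of the thresholding bullet: a pixel on the white plate tends to be bright (high Value) and nearly colorless (low Saturation). The sketch below uses only the standard library's colorsys module. The function name white_mask and the cut-offs 0.6 and 0.3 are assumptions made for this example, not the values tuned in model.ipynb, and a real pipeline would normally apply the same test to whole arrays with an image library.

```python
import colorsys


def white_mask(rgb_rows, min_value=0.6, max_saturation=0.3):
    """Return a 0/1 mask marking bright, low-saturation (white-ish) pixels.

    rgb_rows is a list of rows, each a list of (r, g, b) tuples in 0..255.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    mask = []
    for row in rgb_rows:
        mask_row = []
        for r, g, b in row:
            # colorsys expects floats in 0..1 and returns (hue, saturation, value)
            _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            mask_row.append(1 if v >= min_value and s <= max_saturation else 0)
        mask.append(mask_row)
    return mask


image = [
    [(250, 250, 245), (20, 20, 20)],    # white plate pixel, black glyph pixel
    [(30, 60, 200), (240, 238, 235)],   # saturated blue pixel, white plate pixel
]
mask = white_mask(image)
```

Requiring both conditions is what rejects the blue pixel here: it is bright enough to pass a Value-only threshold, but its saturation is far too high for a white surface.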

Step2: Extracting the white plate using contours.

  • Get contours and sort them by area.
  • Polygon approximation for noisy contours.
  • Convex hull for concave polygons.
  • 4-point transformation for difficult camera angles.

We now have the numbers in one contour and the letters in another.
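
The 4-point transformation in Step2 needs the four plate corners in a fixed order and a target size before any warp can be computed. The sketch below shows one common way to do that bookkeeping in plain Python. The helper names order_corners and target_size are hypothetical, and the actual warp (for example with OpenCV's cv2.getPerspectiveTransform and cv2.warpPerspective) is left out.

```python
import math


def order_corners(points):
    """Order 4 (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Uses image coordinates (y grows downwards): the top-left corner has the
    smallest x + y, the bottom-right the largest; the top-right corner has the
    smallest y - x, the bottom-left the largest.
    """
    by_sum = sorted(points, key=lambda p: p[0] + p[1])
    by_diff = sorted(points, key=lambda p: p[1] - p[0])
    top_left, bottom_right = by_sum[0], by_sum[-1]
    top_right, bottom_left = by_diff[0], by_diff[-1]
    return [top_left, top_right, bottom_right, bottom_left]


def target_size(corners):
    """Width and height of the straightened plate, from the longer opposite edges."""
    tl, tr, br, bl = corners
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)


# Corners of a skewed plate, in arbitrary order (made-up numbers).
quad = [(210, 95), (12, 40), (18, 110), (200, 30)]
ordered = order_corners(quad)
size = target_size(ordered)
```

The ordered corners would then be mapped to (0, 0), (width, 0), (width, height), (0, height) to straighten the plate.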

Step3: Separating characters from white plate using sliding windows.

Contours can't be used to get the symbols in the white plate, since an Arabic letter may consist of multiple parts, e.g. ت may produce 2 or 3 contours.

Solution

  • Tuned 2 sliding windows, one for the letters' white plate, the other for the numbers'.
    • Variable window width
    • Window height is the white plate height, since Arabic characters may consist of multiple parts
  • Selecting which window to keep
    • Must have no black pixels on the sides
    • Must have a specific range of black pixels inside
    • For each group of windows the one with max black pixels is selected
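
The selection rules above can be sketched in plain Python on a binarized plate (1 = black, 0 = white). This is a minimal sketch, not the code in model.ipynb: the function names, the widths, and the ink range are assumptions, and "group of windows" is interpreted here as a set of overlapping candidates.

```python
def column_ink(plate):
    """Count black pixels in every column of a 0/1 plate image."""
    return [sum(col) for col in zip(*plate)]


def candidate_windows(plate, widths, min_ink, max_ink):
    """Full-height windows (start, end, ink) with clean sides and ink in range."""
    ink = column_ink(plate)
    found = []
    for start in range(len(ink)):
        for width in widths:
            end = start + width  # exclusive
            if end > len(ink):
                continue
            if ink[start] or ink[end - 1]:
                continue  # a black pixel touches the left or right side
            total = sum(ink[start:end])
            if min_ink <= total <= max_ink:
                found.append((start, end, total))
    return found


def select_windows(windows):
    """Keep the window with the most ink in each overlapping group.

    Ties go to the narrower window; anything overlapping a kept window is dropped.
    """
    ranked = sorted(windows, key=lambda w: (-w[2], w[1] - w[0], w[0]))
    chosen = []
    for start, end, ink in ranked:
        if all(end <= s or start >= e for s, e, _ in chosen):
            chosen.append((start, end, ink))
    return sorted(chosen)


rows = [
    "..#.........",   # a dot above the first glyph
    "............",
    ".##...#.#...",   # second glyph has two parts split by a white column
    ".##...#.....",
]
plate = [[1 if ch == "#" else 0 for ch in row] for row in rows]
picked = select_windows(candidate_windows(plate, widths=(3, 4, 5), min_ink=2, max_ink=5))
```

In this toy plate the second glyph is split by a white column, so a narrow window around its left part alone also passes the side and ink checks. Preferring the window with the most black pixels is what keeps both parts together.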

Step4: Character Recognition.

  • Training 2 models, since Arabic letters and numbers look similar, e.g. (أ, 1) (5, ه)
    • one for classifying only Arabic letters.
    • one for classifying Arabic numbers.
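
Because Step2 already separates the numbers plate from the letters plate, each crop can be sent to the matching classifier, so a vertical stroke is never asked to be "either a letter or a digit". The sketch below only illustrates that routing. read_plate and the two stand-in models are hypothetical and do not reflect the trained classifiers.

```python
def read_plate(number_crops, letter_crops, number_model, letter_model):
    """Classify each half of the plate with its own model.

    number_model and letter_model are any callables mapping one crop to a label.
    """
    numbers = [number_model(crop) for crop in number_crops]
    letters = [letter_model(crop) for crop in letter_crops]
    return numbers, letters


# Stand-in models for illustration only; real ones would be trained classifiers.
def number_model(crop):
    return "1"


def letter_model(crop):
    return "alef"


# The same vertical stroke appears on both halves of the plate.
stroke = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
numbers, letters = read_plate([stroke], [stroke], number_model, letter_model)
```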

Project Organization

Scripts applied on images

./Macathon/code/
├── extract_bbx_xml.ipynb                       : Takes a directory of images and their bbx data stored in XML files, and crops the bbxs from the images.
|                                                 The XML file contains the licence label (name), xmin, ymin, xmax, ymax of the bbxs in an image.
├── extract_bbx_txt.ipynb                       : Takes a directory of images and their bbx data stored in txt files, and crops the bbxs from the images.
|                                                 The txt file corresponding to one image may contain multiple bbxs, each on its own row of xmin,ymin,xmax,ymax.
└── crop_right_noise.ipynb                      : Crops an image by some percentage and replaces it with the cropped image.
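
For orientation, here is a minimal sketch of what the two bbx extraction notebooks have to do: read the boxes, then crop them. The helper names are hypothetical. The XML tag layout (object / name / bndbox) is an assumption based on the common Pascal VOC annotation style, since only the fields name, xmin, ymin, xmax, ymax are documented above. Treating xmax and ymax as exclusive is also an assumption.

```python
import xml.etree.ElementTree as ET


def boxes_from_xml(xml_text):
    """Return [(name, xmin, ymin, xmax, ymax), ...] from a VOC-style annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        coords = tuple(
            int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((name, *coords))
    return boxes


def boxes_from_txt(txt):
    """Return [(xmin, ymin, xmax, ymax), ...], one comma-separated row per box."""
    boxes = []
    for line in txt.splitlines():
        if line.strip():
            boxes.append(tuple(int(float(v)) for v in line.split(",")))
    return boxes


def crop(image_rows, box):
    """Crop a list-of-rows image to (xmin, ymin, xmax, ymax), max edges exclusive."""
    xmin, ymin, xmax, ymax = box
    return [row[xmin:xmax] for row in image_rows[ymin:ymax]]


sample_xml = (
    "<annotation><object><name>3</name><bndbox>"
    "<xmin>1</xmin><ymin>0</ymin><xmax>3</xmax><ymax>2</ymax>"
    "</bndbox></object></annotation>"
)
image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
xml_boxes = boxes_from_xml(sample_xml)
txt_boxes = boxes_from_txt("1,0,3,2\n0,1,2,3\n")
patch = crop(image, xml_boxes[0][1:])
```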

Model versions

./Macathon/code/
└── model.ipynb                      : - The preprocessing and modeling stage. Contains:
                                          - Preprocessing functions
                                          - Training both classifiers
                                          - Prediction and generating the output csv file

Data Folder

./Macathon/data/
├── challenging_images.rar                      : Contains the most challenging images collected from the train data.
├── cropped_letters.zip                         : 28 subfolders corresponding to the 28 letters of the Arabic alphabet.
|                                                 Each subfolder holds images for the letter it's named after, cropped from the train data distribution.
├── cropped_numbers.zip                         : 10 subfolders for the 10 numbers.
|                                                 Each subfolder holds images for the number it's named after, cropped from the train data distribution.
├── machathon-3.zip                             : The data provided with the Kaggle competition.
└── testLetters.zip                             : 200 images labeled from the test data distribution.
                                                  Each image has a corresponding XML file holding the locations of its bbxs.

Contributors

This masterpiece was designed and implemented by

Hossam Saeed
Mostafa Wael
Nada Elmasry
Noran Hany