Complete portable pipeline for masking of Aadhaar Number adhering to Govt. Privacy Guidelines.

Overview

Aadhaar Number Masking Pipeline

Implementation of a complete pipeline that masks the Aadhaar Number in given images to adhere to Govt. of India's Privacy Guidelines for storage of Aadhaar Card images in digital repository. The following project was carried out as an internship for Muthoot Finance. We make use of the open source packages CRAFT text detector | Paper | Pretrained Model | Github Repo provided by Clova AI Research for OSD and combine a heurestic model with pytesseract OCR for masking.

Rohit Ranjan, Ram Sundaram.

Sample Results

teaser

Versions

The search for the best masking pipeline led us to experiment with several different approaches. We have documented our experiments in other branches.

Branch(->model) Speed/Performance Pipeline
main Best performing CRAFT + pytesseract + dimensional heuristics
CNN_OCR->cnn_model Fastest masking CRAFT + LeNet trained by us
CNN_OCR->cnn_model_2 Fastest masking CRAFT + LeNet trained by us
UNET_OCDR Theoretically Fastest but trained model unavailable** UNet

**We proposed and implemented a pipeline which uses a single UNet model for achieving a desirable mask. A single model would have made the inference very fast and real time use capable on mobile devices. Training meant creating a dataset since the company could not legally provide us the needed data. After several trials, we halted work on this model because with barely 150 unique datapoints available, a data hungry UNet Model is simply unsatiable for now.

Datasets

Lenet was trained on our self-created labelled dataset | labels.

Getting started

Install dependencies

Requirements

  • torch
  • opencv-python
  • tesseract-ocr
  • check requirements.txt
pip install -r requirements.txt

Test instructions

  • Clone this repository
git clone https://github.com/thefurorjuror/Aadhaar_Masker.git
  • Run on an image folder
python [folder path to the cloned repo]/masker.py --test_folder=[folder path to test images] --output_folder=[folder path to output images] --cuda=[True/False]
#Example- When one is inside the cloned repo
python masker.py --test_folder=./images/ --output_folder=./output/ --cuda=True

cuda is set to False by default.

Citation

@inproceedings{baek2019character,
  title={Character Region Awareness for Text Detection},
  author={Baek, Youngmin and Lee, Bado and Han, Dongyoon and Yun, Sangdoo and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9365--9374},
  year={2019}
}

License

Copyright (c) 2021-present Rohit Ranjan & Ram Sundaram.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
Periodically check the manuscript state in the scholar one system and send email when finding a new state.

ScholarOne-manuscript-checker Periodically check the manuscript state in the scholar one system and send email when finding a new state. Parameters ne

2 Aug 18, 2022
An unofficial library for discord components (under-development)

discord-components An unofficial library for discord components (under-development) Welcome! Discord components are cool, but discord.py will support

11 Jun 14, 2021
PS3API - PS3 API for TMAPI and CCAPI in python.

PS3API PS3 API for TMAPI and CCAPI in python. Examples Connecting and Attaching from ps3api import PS3API PS3 = PS3API(PS3API.API_TMAPI) if PS3.Conn

Adam 9 Sep 01, 2022
Basic query to access Traindex API

traindex-api-query Basic query to access Traindex API Please make sure to provide your Traindex API key to the line 8 in the script file. There are tw

Foretheta 1 Nov 11, 2021
Análise de dados abertos do programa Taxigov.

Análise de dados do Taxigov Este repositório contém os cadernos Jupyter usados no projeto de análise de dados do Taxigov. Conjunto de dados O conjunto

Augusto Herrmann 1 Jan 10, 2022
Optimus Prime - A modular Telegram group management and drive clone bot running on Python with sqlalchemy database

Optimus Prime Bot . 🤖 A modular Telegram group management and drive clone bot r

9 Jun 01, 2022
A stable and Fast telegram video convertor bot which can compress, convert(video into audio and other video formats), rename with permanent thumbnail and trim.

ᴠɪᴅᴇᴏ ᴄᴏɴᴠᴇʀᴛᴏʀ A stable and Fast telegram video convertor bot which can compress, convert(video into audio and other video formats), rename and trim.

Mahesh Chauhan 183 Jan 04, 2023
The Encoder Bot For Python

The_Encoder_Bot Configuration Add values in environment variables or add them in config.env.example and rename file to config.env. Basics API_ID - Get

8 Jan 01, 2022
Quack-SMS-BOMBER - Quack Toolkit By IkigaiHack

Quack Toolkit By IkigaiHack About Quack Toolkit Quack Toolkit is a set of tools

Marcel 2 Aug 19, 2022
Discord Webhook Spammer (fastest)

Discord Webhook Spammer A simple fast asynchronous webhook spammer. Spammer Features Fast message spamming. Controllable speed. Noob friendly. Usage N

Varient 2 Apr 22, 2022
A telegram bot providing recon and research functions for bug bounty research

Bug Bounty Bot A telegram bot with commands to simplify bug bounty tasks Installation Use Road Map Installation BugBountyBot is open-source so you can

Tyler Butler 1 Oct 23, 2021
自用直播源集合,附带检测与分类功能。

myiptv 自用直播源集合,附带检测与分类功能。 为啥搞 TLDR: 太闲了。 自己有收集直播源的爱好,和录制直播源的需求。 一些软件自带的直播源太过难用。 网上现有的直播源太杂,且缺乏检测。 一些大源缺乏持续更新,如 iptv-org。 使用指南与 TODO 每次进行大更新后都会进行一次 rel

abc1763613206 171 Dec 11, 2022
Anti-league-discordbot - Harrasses imbeciles for playing league of legends

anti-league-discordbot harrasses imbeciles for playing league of legends Running

Chris Clem 2 Feb 12, 2022
A cool discord bot, called Fifi

Fifi A cool discord bot, called Fifi This bot is the official server bot of Meme Studios discord server. This github repo is the code we use for the b

Fifi Discord Bot 3 Jun 08, 2021
Der Dischkort Bot für Andiismus

AndreOS Der Dischkort Bot für Andiismus Wichtigger Bot für den hauseigenen Discord-Server Indoktrinationsmechanismusleitungsprogramm der andiistischen

Leon Bartle 3 Jan 13, 2022
Python wrapper library for World Weather Online API

pywwo Python wrapper library for World Weather Online API using lxml.objectify How to use from pywwo import * setKey('your_key', 'free') w=LocalWeat

World Weather Online 20 Dec 19, 2022
A Advanced Auto Filter Bot Which Can Be Used In Many Groups With Multiple Channel Support....

Adv Auto Filter Bot This Just A Simple Hand Auto Filter Bot For Searching Files From Channel... Just Sent Any Text I Will Search In All Connected Chat

Albert Einstein 33 Oct 21, 2022
Telegram Bot for updating ongoing matches of Fotmob.com in channel by @AbirHasan2005

Fotmob-Bot A very simple Telegram Bot which will update ongoing matches of Fotmob in a channel. Demo Channel Configs API_ID - Get this from @TeleORG_B

Abir Hasan 22 Oct 21, 2022
Slack bot for monitoring your Metaflow flows!

Metaflowbot - Slack Bot for your Metaflow flows! Metaflowbot makes it fun and easy to monitor your Metaflow runs, past and present. Imagine starting a

Outerbounds 21 Dec 07, 2022
Morpy Bot Linux - Morpy Bot Linux With Python

Morpy_Bot_Linux Guide to using the robot : 🔸 Lsmod = to identify admins and st

2 Jan 20, 2022