A transformer which can randomly augment VOC format dataset (both image and bbox) online.

Overview

VocAug

It is difficult to find a script which can augment VOC-format dataset, especially the bbox. Or find a script needs complex requirements so it is hard to use. Or, it is offline but not online so it needs very very large disk volume.

Here, is a simple transformer which can randomly augment VOC format dataset online! It can work with only numpy and cv2 packages!

The highlight is,

  1. it augments both image and b-box!!!
  2. it only use cv2 & numpy, means it could be used simply without any other awful packages!!!
  3. it is an online transformer!!!

It contains methods of:

  1. Random HSV augmentation
  2. Random Cropping augmentation
  3. Random Flipping augmentation
  4. Random Noise augmentation
  5. Random rotation or translation augmentation

All the methods can adjust abundant arguments in the constructed function of class VocAug.voc_aug.

Here are some visualized examples:

(click to enlarge)

e.g. #1 e.g. #2
eg1 eg2

More

This script was created when I was writing YOLOv1 object detectin algorithm for learning and entertainment. See more details at https://github.com/BestAnHongjun/YOLOv1-pytorch

Quick Start

1. Download this repo.

git clone https://github.com/BestAnHongjun/VOC-Augmentation.git

or you can download the zip file directly.

2. Enter project directory

cd VOC-Augmentation

3. Install the requirements

pip install -r requirements.txt

For some machines with mixed environments, you need to use pip3 but not pip.

Or you can install the requirements by hand. The default version is ok.

pip install numpy
pip install opencv-python
pip install opencv-contrib-python
pip install matplotlib

4.Create your own project directory

Create your own project directory, then copy the VocAug directory to yours. Or you can use this directory directly.

5. Create your own demo.py file

Or you can use my demo.py directly.

Thus, you should have a project directory with structure like this:

Project_Dir
  |- VocAug (dir)
  |- demo.py

Open your demo.py.

First, import some system packages.

import os
import matplotlib.pyplot as plt

Second, import my VocAug module in your project directory.

from VocAug.voc_aug import voc_aug
from VocAug.transform.voc2vdict import voc2vdict
from VocAug.utils.viz_bbox import viz_vdict

Third, Create two transformer.

voc2vdict_transformer = voc2vdict()
augmentation_transformer = voc_aug()

For the class voc2vdict, when you call its instance with args of xml_file_path and image_file_path, it can read the xml file and the image file and then convert them to VOC-format-dict, represented by vdict.

What is vdict? It is a python dict, which has a structure like:

vdict = {
    "image": numpy.array([[[....]]]),   # Cv2 image Mat. (Shape:[h, w, 3], RGB format)
    "filename": 000048,                 # filename without suffix
    "objects": [{                       # A list of dicts representing b-boxes
        "class_name": "house",
        "class_id": 2,                  # index of self.class_list
        "bbox": (x_min, y_min, x_max, y_max)
    }, {
        ...
    }]
}

For the class voc_aug, when you call its instance by args of vdict, it can augment both image and bbox of the vdict, then return a vdict augmented.

It will randomly use augmentation methods include:

  1. Random HSV augmentation
  2. Random Cropping augmentation
  3. Random Flipping augmentation
  4. Random Noise augmentation
  5. Random rotation or translation augmentation

Then, let's augment the vdict.

# prepare the xml-file-path and the image-file-path
filename = "000007"
file_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "dataset")
xml_file_path = os.path.join(file_dir, "Annotations", "{}.xml".format(filename))
image_file_path = os.path.join(file_dir, "JPEGImages", "{}.jpg".format(filename))

# Firstly convert the VOC format xml&image path to VOC-dict(vdict), then augment it.
src_vdict = voc2vdict_transformer(xml_file_path, image_file_path)
image_aug_vdict = augmentation_transformer(src_vdict)

The 000007.jpg and 000007.xml is in the dataset directory under Annotations and JPEGImages separately.

Then you can visualize the vdict. I have prepare a tool for you. That is viz_vdict function in VocAug.utils.viz_bbox module. It will return you a cv2 image when you input a vdict into it.

You can use it like:

image_src = src_vdict.get("image")
image_src_with_bbox = viz_vdict(src_vdict)

image_aug = image_aug_vdict.get("image")
image_aug_with_bbox = viz_vdict(image_aug_vdict)

Visualize them by matplotlib.

plt.figure(figsize=(15, 10))
plt.subplot(2, 2, 1)
plt.title("src")
plt.imshow(image_src)
plt.subplot(2, 2, 3)
plt.title("src_bbox")
plt.imshow(image_src_with_bbox)
plt.subplot(2, 2, 2)
plt.title("aug")
plt.imshow(image_aug)
plt.subplot(2, 2, 4)
plt.title("aug_bbox")
plt.imshow(image_aug_with_bbox)
plt.show()

Then you will get a random result like this. eg1

For more detail see demo.py .

Detail of Algorithm

I am writing this part...

Owner
Coder.AN
Researcher, CoTAI Lab, Dalian Maritime University. Focus on Computer Vision, Moblie Vision, and Machine Learning. Contact me at
Coder.AN
Small-bets - Ergodic Experiment With Python

Ergodic Experiment Based on this video. Run this experiment with this command: p

Michael Brant 3 Jan 11, 2022
Vikrant Deshpande 1 Nov 17, 2022
Learning hierarchical attention for weakly-supervised chest X-ray abnormality localization and diagnosis

Hierarchical Attention Mining (HAM) for weakly-supervised abnormality localization This is the official PyTorch implementation for the HAM method. Pap

Xi Ouyang 22 Jan 02, 2023
A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body

DensePose: Dense Human Pose Estimation In The Wild Rฤฑza Alp Gรผler, Natalia Neverova, Iasonas Kokkinos [densepose.org] [arXiv] [BibTeX] Dense human pos

Meta Research 6.4k Jan 01, 2023
[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation

K-Net: Towards Unified Image Segmentation Introduction This is an official release of the paper K-Net:Towards Unified Image Segmentation. K-Net will a

Wenwei Zhang 423 Jan 02, 2023
All the essential resources and template code needed to understand and practice data structures and algorithms in python with few small projects to demonstrate their practical application.

Data Structures and Algorithms Python INDEX 1. Resources - Books Data Structures - Reema Thareja competitiveCoding Big-O Cheat Sheet DAA Syllabus Inte

Shushrut Kumar 129 Dec 15, 2022
CLADE - Efficient Semantic Image Synthesis via Class-Adaptive Normalization (TPAMI 2021)

Efficient Semantic Image Synthesis via Class-Adaptive Normalization (Accepted by TPAMI)

tzt 49 Nov 17, 2022
The missing CMake project initializer

cmake-init - The missing CMake project initializer Opinionated CMake project initializer to generate CMake projects that are FetchContent ready, separ

1k Jan 01, 2023
Jittor 64*64 implementation of StyleGAN

StyleGanJittor (Tsinghua university computer graphics course) Overview Jittor 64

Song Shengyu 3 Jan 20, 2022
QuanTaichi evaluation suite

QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021) Yuanming Hu, Jiafeng Liu, Xuanda Yang, Mingkuan Xu, Ye Kuang, Weiwei Xu, Qiang Dai, W

Taichi Developers 120 Jan 04, 2023
A Python module for the generation and training of an entry-level feedforward neural network.

ff-neural-network A Python module for the generation and training of an entry-level feedforward neural network. This repository serves as a repurposin

Riadh 2 Jan 31, 2022
๐Ÿ…๐Ÿ…๐Ÿ…YOLOv5-Lite: lighter, faster and easier to deploy. Evolved from yolov5 and the size of model is only 1.7M (int8) and 3.3M (fp16). It can reach 10+ FPS on the Raspberry Pi 4B when the input size is 320ร—320~

YOLOv5-Lite๏ผšlighter, faster and easier to deploy Perform a series of ablation experiments on yolov5 to make it lighter (smaller Flops, lower memory, a

pogg 1.5k Jan 05, 2023
๐Ÿฅ‡ LG-AI-Challenge 2022 1์œ„ ์†”๋ฃจ์…˜ ์ž…๋‹ˆ๋‹ค.

LG-AI-Challenge-for-Plant-Classification Dacon์—์„œ ์ง„ํ–‰๋œ ๋†์—… ํ™˜๊ฒฝ ๋ณ€ํ™”์— ๋”ฐ๋ฅธ ์ž‘๋ฌผ ๋ณ‘ํ•ด ์ง„๋‹จ AI ๊ฒฝ์ง„๋Œ€ํšŒ ์— ๋Œ€ํ•œ ์ฝ”๋“œ์ž…๋‹ˆ๋‹ค. (colab directory์— ์ฝ”๋“œ๊ฐ€ ์ž˜ ์ •๋ฆฌ ๋˜์–ด์žˆ์Šต๋‹ˆ๋‹ค.) Requirements python

siwooyong 10 Jun 30, 2022
TensorFlow implementation of Elastic Weight Consolidation

Elastic weight consolidation Introduction A TensorFlow implementation of elastic weight consolidation as presented in Overcoming catastrophic forgetti

James Stokes 67 Oct 11, 2022
Pytorch implementation of MalConv

MalConv-Pytorch A Pytorch implementation of MalConv Desciprtion This is the implementation of MalConv proposed in Malware Detection by Eating a Whole

Alexander H. Liu 58 Oct 26, 2022
Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks

Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks Stable Neural ODE with Lyapunov-Stable Equilibrium

Kang Qiyu 8 Dec 12, 2022
This code is an unofficial implementation of HiFiSinger.

HiFiSinger This code is an unofficial implementation of HiFiSinger. The algorithm is based on the following papers: Chen, J., Tan, X., Luan, J., Qin,

Heejo You 87 Dec 23, 2022
EfficientNetV2 implementation using PyTorch

EfficientNetV2-S implementation using PyTorch Train Steps Configure imagenet path by changing data_dir in train.py python main.py --benchmark for mode

Jahongir Yunusov 86 Dec 29, 2022
M3DSSD: Monocular 3D Single Stage Object Detector

M3DSSD: Monocular 3D Single Stage Object Detector Setup pytorch 0.4.1 Preparation Download the full KITTI detection dataset. Then place a softlink (or

mumianyuxin 64 Dec 27, 2022
Malware Env for OpenAI Gym

Malware Env for OpenAI Gym Citing If you use this code in a publication please cite the following paper: Hyrum S. Anderson, Anant Kharkar, Bobby Fila

ENDGAME 563 Dec 29, 2022