PyTorch implementation of the YOLO (You Only Look Once) v2

Overview

PyTorch implementation of the YOLO (You Only Look Once) v2

The YOLOv2 is one of the most popular one-stage object detector. This project adopts PyTorch as the developing framework to increase productivity, and utilize ONNX to convert models into Caffe 2 to benefit engineering deployment. If you are benefited from this project, a donation will be appreciated (via PayPal, 微信支付 or 支付宝).

Designs

  • Flexible configuration design. Program settings are configurable and can be modified (via configure file overlaping (-c/--config option) or command editing (-m/--modify option)) using command line argument.

  • Monitoring via TensorBoard. Such as the loss values and the debugging images (such as IoU heatmap, ground truth and predict bounding boxes).

  • Parallel model training design. Different models are saved into different directories so that can be trained simultaneously.

  • Using a NoSQL database to store evaluation results with multiple dimension of information. This design is useful when analyzing a large amount of experiment results.

  • Time-based output design. Running information (such as the model, the summaries (produced by TensorBoard), and the evaluation results) are saved periodically via a predefined time.

  • Checkpoint management. Several latest checkpoint files (.pth) are preserved in the model directory and the older ones are deleted.

  • NaN debug. When a NaN loss is detected, the running environment (data batch) and the model will be exported to analyze the reason.

  • Unified data cache design. Various dataset are converted into a unified data cache via corresponding cache plugins. Some plugins are already implemented. Such as PASCAL VOC and MS COCO.

  • Arbitrarily replaceable model plugin design. The main deep neural network (DNN) can be easily replaced via configuration settings. Multiple models are already provided. Such as Darknet, ResNet, Inception v3 and v4, MobileNet and DenseNet.

  • Extendable data preprocess plugin design. The original images (in different sizes) and labels are processed via a sequence of operations to form a training batch (images with the same size, and bounding boxes list are padded). Multiple preprocess plugins are already implemented. Such as augmentation operators to process images and labels (such as random rotate and random flip) simultaneously, operators to resize both images and labels into a fixed size in a batch (such as random crop), and operators to augment images without labels (such as random blur, random saturation and random brightness).

Feautures

  • Reproduce the original paper's training results.
  • Multi-scale training.
  • Dimension cluster.
  • Darknet model file (.weights) parser.
  • Detection from image and camera.
  • Processing Video file.
  • Multi-GPU supporting.
  • Distributed training.
  • Focal loss.
  • Channel-wise model parameter analyzer.
  • Automatically change the number of channels.
  • Receptive field analyzer.

Quick Start

This project uses Python 3. To install the dependent libraries, type the following command in a terminal.

sudo pip3 install -r requirements.txt

quick_start.sh contains the examples to perform detection and evaluation. Run this script. Multiple datasets and models (the original Darknet's format, will be converted into PyTorch's format) will be downloaded (aria2 is required). These datasets are cached into different data profiles, and the models are evaluated over the cached data. The models are used to detect objects in an example image, and the detection results will be shown.

License

This project is released as the open source software with the GNU Lesser General Public License version 3 (LGPL v3).

Owner
申瑞珉 (Ruimin Shen)
申瑞珉 (Ruimin Shen)
Instance-Dependent Partial Label Learning

Instance-Dependent Partial Label Learning Installation pip install -r requirements.txt Run the Demo benchmark-random mnist python -u main.py --gpu 0 -

17 Dec 29, 2022
Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach

This repository holds the implementation for paper Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach Download our preproc

Qitian Wu 42 Dec 27, 2022
Ultra-lightweight human body posture key point CNN model. ModelSize:2.3MB HUAWEI P40 NCNN benchmark: 6ms/img,

Ultralight-SimplePose Support NCNN mobile terminal deployment Based on MXNET(=1.5.1) GLUON(=0.7.0) framework Top-down strategy: The input image is t

223 Dec 27, 2022
Official repository for: Continuous Control With Ensemble DeepDeterministic Policy Gradients

Continuous Control With Ensemble Deep Deterministic Policy Gradients This repository is the official implementation of Continuous Control With Ensembl

4 Dec 06, 2021
Funnels: Exact maximum likelihood with dimensionality reduction.

Funnels This repository contains the code needed to reproduce the experiments from the paper: Funnels: Exact maximum likelihood with dimensionality re

2 Apr 21, 2022
The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

Cutoff: A Simple Data Augmentation Approach for Natural Language This repository contains source code necessary to reproduce the results presented in

Dinghan Shen 49 Dec 22, 2022
JORLDY an open-source Reinforcement Learning (RL) framework provided by KakaoEnterprise

Repository for Open Source Reinforcement Learning Framework JORLDY

Kakao Enterprise Corp. 330 Dec 30, 2022
Tooling for the Common Objects In 3D dataset.

CO3D: Common Objects In 3D This repository contains a set of tools for working with the Common Objects in 3D (CO3D) dataset. Download the dataset The

Facebook Research 724 Jan 06, 2023
An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models.

Kazakh Named Entity Recognition This repository contains an open-source Kazakh named entity recognition dataset (KazNERD), named entity annotation gui

ISSAI 9 Dec 23, 2022
Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition"

CLIPstyler Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition" Environment Pytorch 1.7.1, Python 3.6 $ c

201 Dec 29, 2022
Code for "Diversity can be Transferred: Output Diversification for White- and Black-box Attacks"

Output Diversified Sampling (ODS) This is the github repository for the NeurIPS 2020 paper "Diversity can be Transferred: Output Diversification for W

50 Dec 11, 2022
Collaborative forensic timeline analysis

Timesketch Table of Contents About Timesketch Getting started Community Contributing About Timesketch Timesketch is an open-source tool for collaborat

Google 2.1k Dec 28, 2022
Predicting a person's gender based on their weight and height

Logistic Regression Advanced Case Study Gender Classification: Predicting a person's gender based on their weight and height 1. Introduction We turn o

1 Feb 01, 2022
Python scripts for performing road segemtnation and car detection using the HybridNets multitask model in ONNX.

ONNX-HybridNets-Multitask-Road-Detection Python scripts for performing road segemtnation and car detection using the HybridNets multitask model in ONN

Ibai Gorordo 45 Jan 01, 2023
Public repo for the ICCV2021-CVAMD paper "Is it Time to Replace CNNs with Transformers for Medical Images?"

Is it Time to Replace CNNs with Transformers for Medical Images? Accepted at ICCV-2021: Workshop on Computer Vision for Automated Medical Diagnosis (C

Christos Matsoukas 80 Dec 27, 2022
Python Assignments for the Deep Learning lectures by Andrew NG on coursera with complete submission for grading capability.

Python Assignments for the Deep Learning lectures by Andrew NG on coursera with complete submission for grading capability.

Utkarsh Agiwal 1 Feb 03, 2022
PyTorch implementation of "PatchGame: Learning to Signal Mid-level Patches in Referential Games" to appear in NeurIPS 2021

PatchGame: Learning to Signal Mid-level Patches in Referential Games This repository is the official implementation of the paper - "PatchGame: Learnin

Kamal Gupta 22 Mar 16, 2022
Face Alignment using python

Face Alignment Face Alignment using python Input Image Aligned Face Aligned Face Aligned Face Input Image Aligned Face Input Image Aligned Face Instal

Sajjad Aemmi 28 Nov 23, 2022
PyTorch Implementation of Spatially Consistent Representation Learning(SCRL)

Spatially Consistent Representation Learning (CVPR'21) Official PyTorch implementation of Spatially Consistent Representation Learning (SCRL). This re

Kakao Brain 102 Nov 03, 2022
RAMA: Rapid algorithm for multicut problem

RAMA: Rapid algorithm for multicut problem Solves multicut (correlation clustering) problems orders of magnitude faster than CPU based solvers without

Paul Swoboda 60 Dec 13, 2022