A new state-of-the-art lightweight YOLO model implemented in TensorFlow 2.

Last update: Dec 21, 2022

Overview

CSL-YOLO: A New Lightweight Object Detection System for Edge Computing

This project provides a SOTA-level lightweight YOLO called "Cross-Stage Lightweight YOLO" (CSL-YOLO).

It achieves better detection performance than Tiny-YOLOv4 with only 43% of the FLOPs and 52% of the parameters.

Paper Link: https://arxiv.org/abs/2107.04829

Requirements

How to Get Started?

#Predict
python3 main.py -p cfg/predict_coco.cfg

#Train
python3 main.py -t cfg/train_coco.cfg

#Eval
python3 main.py -ce cfg/eval_coco.cfg

WebCam DEMO (on CPU)

This DEMO runs in a CPU-only environment. The CPU is an i7-6600U (2.6 GHz to 3.4 GHz), the model scale is 224x224, and the frame rate is about 10 FPS.

Run the following command to start this DEMO. The "camera_idx" entry in the cfg file is the index of the camera to use.

#Camera DEMO
python3 main.py -d cfg/demo_coco.cfg

More Info

Change Model Scale

The model's default scale is 224x224. To change the scale to a value from 320 to 512,

open cfg/XXXX.cfg and change the following two parts (the input_shape/out_hw_list pair and weight_path):

# input_shape=[512,512,3]
# out_hw_list=[[64,64],[48,48],[32,32],[24,24],[16,16]]
# input_shape=[416,416,3]
# out_hw_list=[[52,52],[39,39],[26,26],[20,20],[13,13]]
# input_shape=[320,320,3]
# out_hw_list=[[40,40],[30,30],[20,20],[15,15],[10,10]]
input_shape=[224,224,3]
out_hw_list=[[28,28],[21,21],[14,14],[10,10],[7,7]]

weight_path=weights/224_nolog.hdf5

                         |
                         | 224 to 320
                         V
                         
# input_shape=[512,512,3]
# out_hw_list=[[64,64],[48,48],[32,32],[24,24],[16,16]]
# input_shape=[416,416,3]
# out_hw_list=[[52,52],[39,39],[26,26],[20,20],[13,13]]
input_shape=[320,320,3]
out_hw_list=[[40,40],[30,30],[20,20],[15,15],[10,10]]
# input_shape=[224,224,3]
# out_hw_list=[[28,28],[21,21],[14,14],[10,10],[7,7]]

weight_path=weights/320_nolog.hdf5
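
The preset switch above can also be scripted. The following is a minimal sketch and is not part of this repository: `SCALE_PRESETS`, `KNOWN_WEIGHTS`, and `cfg_lines_for_scale` are hypothetical names, and the values are copied from the cfg excerpt above. Only the 224 and 320 weight files appear in this README, so no weight path is emitted for 416 or 512.

```python
import json

# Input size -> output grid sizes, copied from the cfg excerpt in this README.
SCALE_PRESETS = {
    224: [[28, 28], [21, 21], [14, 14], [10, 10], [7, 7]],
    320: [[40, 40], [30, 30], [20, 20], [15, 15], [10, 10]],
    416: [[52, 52], [39, 39], [26, 26], [20, 20], [13, 13]],
    512: [[64, 64], [48, 48], [32, 32], [24, 24], [16, 16]],
}

# Only these two weight files are named in the README; others are not assumed.
KNOWN_WEIGHTS = {
    224: "weights/224_nolog.hdf5",
    320: "weights/320_nolog.hdf5",
}


def _compact(value):
    """Serialize a nested list without spaces, matching the cfg style."""
    return json.dumps(value, separators=(",", ":"))


def cfg_lines_for_scale(size):
    """Return the cfg lines to enable for a square input of the given size."""
    if size not in SCALE_PRESETS:
        raise ValueError(f"no preset for scale {size}")
    lines = [
        "input_shape=" + _compact([size, size, 3]),
        "out_hw_list=" + _compact(SCALE_PRESETS[size]),
    ]
    if size in KNOWN_WEIGHTS:
        lines.append("weight_path=" + KNOWN_WEIGHTS[size])
    return lines


if __name__ == "__main__":
    print("\n".join(cfg_lines_for_scale(320)))
```

The printed lines are meant to be pasted into the cfg file by hand, with the other presets left commented out.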

Full Dataset

The entire MS-COCO dataset is too large to include, so only a few pictures are stored here for the DEMO.

If you need the complete data, please download it on this page.

Our Data Format

We did not use the official MS-COCO format. We express a bounding box as follows:

[ left_top_x<float>, left_top_y<float>, w<float>, h<float>, confidence<float>, class<str> ]

The bounding boxes contained in a picture are stored in a single JSON file.

For the detailed format, please refer to the JSON files in "data/coco/train/json".
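
As an illustration of the box layout above, the sketch below converts one box from the top-left/width/height form to corner coordinates. The helper name `box_to_corners` and the output keys are hypothetical and not part of this repository. The layout of the surrounding JSON file is not assumed here, only the six-element box described above.

```python
def box_to_corners(box):
    """Convert [left_top_x, left_top_y, w, h, confidence, class] to corner form."""
    if len(box) != 6:
        raise ValueError("expected 6 elements: x, y, w, h, confidence, class")
    x, y, w, h, confidence, label = box
    x, y, w, h = float(x), float(y), float(w), float(h)
    return {
        "x_min": x,
        "y_min": y,
        "x_max": x + w,  # right edge = left edge + width
        "y_max": y + h,  # bottom edge = top edge + height
        "confidence": float(confidence),
        "class": str(label),
    }


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(box_to_corners([10.0, 20.0, 30.0, 40.0, 1.0, "person"]))
```

The conversion applies the same way whether the coordinates are in pixels or normalized, since it only adds the width and height to the top-left corner.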

AP Performance on MS-COCO

For the detailed COCO report, please refer to "mscoco_result".

TODOs

Improve the FLOPs calculator script.
Using Focal Loss causes overfitting; we need to explore the reasons.

Owner

Miles Zhang
