NanoDet-Plus ⚡ Super fast and lightweight anchor-free object detection model. 🔥 Only 980KB (INT8) / 1.8MB (FP16), and runs at 97FPS on a cellphone 🔥

Overview

NanoDet-Plus

Super fast, high-accuracy, lightweight anchor-free object detection model. Real-time on mobile devices.


  • Super lightweight: the model file is only 980KB (INT8) or 1.8MB (FP16).
  • Super fast: 97FPS (10.23ms) on a mobile ARM CPU.
  • 👍 High accuracy: up to 34.3 mAPval@0.5:0.95 and still real-time on CPU.
  • 🤗 Training friendly: much lower GPU memory cost than other models. A batch size of 80 fits on a GTX 1060 6G.
  • 😎 Easy to deploy: supports various backends including ncnn, MNN and OpenVINO. An Android demo based on the ncnn inference framework is also provided.

Introduction

NanoDet is a FCOS-style one-stage anchor-free object detection model that uses Generalized Focal Loss as its classification and regression loss.
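
For reference, the Quality Focal Loss half of Generalized Focal Loss trains the classifier against a soft, IoU-based quality target instead of a hard one-hot label. Below is a minimal illustrative sketch of that loss; the function and variable names are mine, not the repository's:

    import torch
    import torch.nn.functional as F

    def quality_focal_loss(logits, quality_targets, beta=2.0):
        # Sketch of Quality Focal Loss from the GFL paper: binary cross-entropy
        # against a soft target in [0, 1] (e.g. the IoU of the matched box on
        # the GT class, 0 elsewhere), modulated by |target - sigmoid|^beta.
        prob = logits.sigmoid()
        bce = F.binary_cross_entropy_with_logits(
            logits, quality_targets, reduction="none")
        modulating = (quality_targets - prob).abs().pow(beta)
        return (modulating * bce).sum()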

In NanoDet-Plus, we propose a novel label assignment strategy with a simple assign guidance module (AGM) and a dynamic soft label assigner (DSLA) to solve the optimal label assignment problem in lightweight model training. We also introduce a light feature pyramid called Ghost-PAN to enhance multi-layer feature fusion. These improvements boost the previous NanoDet's detection accuracy by 7 mAP on the COCO dataset.
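
To make the dynamic assignment idea concrete, here is a small, self-contained sketch of a dynamic top-k soft-label assigner. The function name, cost weights and exact cost terms are simplified assumptions for illustration; the repository's DSLA differs in its details:

    import torch
    import torch.nn.functional as F

    def dynamic_soft_label_assign(pred_scores, ious, gt_labels, candidate_topk=13):
        # pred_scores: (P, C) sigmoid classification scores for P priors
        # ious:        (P, G) IoU between each prior's predicted box and G GT boxes
        # gt_labels:   (G,)   class index (int64) of each GT box
        # Returns:     (P,)   index of the matched GT, or -1 for background.
        num_priors, num_gt = ious.shape
        # Soft classification target: one-hot GT label scaled by the IoU.
        onehot = F.one_hot(gt_labels, pred_scores.shape[-1]).float()
        soft_target = onehot[None] * ious[..., None]                  # (P, G, C)
        pred = pred_scores[:, None, :].expand(-1, num_gt, -1)
        cls_cost = F.binary_cross_entropy(pred, soft_target, reduction="none").sum(-1)
        iou_cost = -torch.log(ious + 1e-7)
        cost = cls_cost + 3.0 * iou_cost                              # (P, G)

        # Dynamic k: each GT claims roughly as many priors as its top IoUs sum to.
        topk_ious, _ = ious.topk(min(candidate_topk, num_priors), dim=0)
        dynamic_ks = topk_ious.sum(0).long().clamp(min=1)

        matched = torch.zeros_like(cost, dtype=torch.bool)
        for g in range(num_gt):
            _, idx = cost[:, g].topk(int(dynamic_ks[g]), largest=False)
            matched[idx, g] = True
        # If a prior is matched to several GTs, keep only the cheapest match.
        conflict = (matched.sum(1) > 1).nonzero(as_tuple=True)[0]
        if conflict.numel() > 0:
            best = cost[conflict].argmin(1)
            matched[conflict] = False
            matched[conflict, best] = True

        assigned = torch.full((num_priors,), -1, dtype=torch.long)
        prior_idx, gt_idx = matched.nonzero(as_tuple=True)
        assigned[prior_idx] = gt_idx
        return assigned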

NanoDet-Plus introduction on Zhihu (Chinese)

NanoDet introduction on Zhihu (Chinese)

QQ discussion group: 908606542 (join answer: 炼丹)


Benchmarks

| Model | Resolution | mAP (0.5:0.95) | CPU Latency (i7-8700) | ARM Latency (4xA76) | FLOPS | Params | Model Size |
|:-----:|:----------:|:--------------:|:---------------------:|:-------------------:|:-----:|:------:|:----------:|
| NanoDet-m | 320*320 | 20.6 | 4.98ms | 10.23ms | 0.72G | 0.95M | 1.8MB(FP16) / 980KB(INT8) |
| NanoDet-Plus-m | 320*320 | 27.0 | 5.25ms | 11.97ms | 0.9G | 1.17M | 2.3MB(FP16) / 1.2MB(INT8) |
| NanoDet-Plus-m | 416*416 | 30.4 | 8.32ms | 19.77ms | 1.52G | 1.17M | 2.3MB(FP16) / 1.2MB(INT8) |
| NanoDet-Plus-m-1.5x | 320*320 | 29.9 | 7.21ms | 15.90ms | 1.75G | 2.44M | 4.7MB(FP16) / 2.3MB(INT8) |
| NanoDet-Plus-m-1.5x | 416*416 | 34.1 | 11.50ms | 25.49ms | 2.97G | 2.44M | 4.7MB(FP16) / 2.3MB(INT8) |
| YOLOv3-Tiny | 416*416 | 16.6 | - | 37.6ms | 5.62G | 8.86M | 33.7MB |
| YOLOv4-Tiny | 416*416 | 21.7 | - | 32.81ms | 6.96G | 6.06M | 23.0MB |
| YOLOX-Nano | 416*416 | 25.8 | - | 23.08ms | 1.08G | 0.91M | 1.8MB(FP16) |
| YOLOv5-n | 640*640 | 28.4 | - | 44.39ms | 4.5G | 1.9M | 3.8MB(FP16) |
| FBNetV5 | 320*640 | 30.4 | - | - | 1.8G | - | - |
| MobileDet | 320*320 | 25.6 | - | - | 0.9G | - | - |

Download pre-trained models and find more in the Model Zoo or in the Release Files.

Notes
  • ARM performance is measured on a Kirin 980 (4xA76 + 4xA55) ARM CPU using ncnn. You can test latency on your own phone with ncnn_android_benchmark.

  • Intel CPU performance is measured on an Intel Core i7-8700 using OpenVINO.

  • NanoDet mAP (0.5:0.95) is validated on the COCO val2017 dataset with no test-time augmentation.

  • YOLOv3 & YOLOv4 mAP values are taken from Scaled-YOLOv4: Scaling Cross Stage Partial Network.


NEWS!!!

  • [2021.12.25] NanoDet-Plus released! Adds AGM (Assign Guidance Module) and DSLA (Dynamic Soft Label Assigner) to improve mAP by 7 points at only a small cost.

Find more update notes in Update notes.

Demo

Android demo


Android demo project is in demo_android_ncnn folder. Please refer to Android demo guide.

Here is a better implementation 👉 ncnn-android-nanodet

NCNN C++ demo

C++ demo based on ncnn is in demo_ncnn folder. Please refer to Cpp demo guide.

MNN demo

Inference using Alibaba's MNN framework is in demo_mnn folder. Please refer to MNN demo guide.

OpenVINO demo

Inference using OpenVINO is in demo_openvino folder. Please refer to OpenVINO demo guide.

Web browser demo

https://nihui.github.io/ncnn-webassembly-nanodet/

PyTorch demo

First, install the requirements and set up NanoDet following the installation guide. Then download the COCO pretrained weight from here:

👉 COCO pretrain checkpoint

The pre-trained weight was trained with the config config/nanodet-plus-m_416.yml.

  • Inference images
python demo/demo.py image --config CONFIG_PATH --model MODEL_PATH --path IMAGE_PATH
  • Inference video
python demo/demo.py video --config CONFIG_PATH --model MODEL_PATH --path VIDEO_PATH
  • Inference webcam
python demo/demo.py webcam --config CONFIG_PATH --model MODEL_PATH --camid YOUR_CAMERA_ID
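
For example, to run image inference with the COCO checkpoint above (the image path is a placeholder; the checkpoint file name follows the release artifacts):

python demo/demo.py image --config config/nanodet-plus-m_416.yml --model nanodet-plus-m_416_checkpoint.ckpt --path test.jpg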

We also provide a notebook here to demonstrate how to run inference with PyTorch.


Install

Requirements

  • Linux or MacOS
  • CUDA >= 10.0
  • Python >= 3.6
  • Pytorch >= 1.7
  • Experimental Windows support (note: Windows does not support distributed training before PyTorch 1.7)

Steps

  1. Create a conda virtual environment and then activate it.

     conda create -n nanodet python=3.8 -y
     conda activate nanodet

  2. Install PyTorch.

     conda install pytorch torchvision cudatoolkit=11.1 -c pytorch -c conda-forge

  3. Install requirements.

     pip install Cython termcolor numpy tensorboard pycocotools matplotlib pyaml opencv-python tqdm pytorch-lightning torchmetrics

  4. Set up NanoDet.

     git clone https://github.com/RangiLyu/nanodet.git
     cd nanodet
     python setup.py develop
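
As a quick, optional sanity check of the installation (this snippet is a suggestion, not part of the official guide):

    import torch
    import nanodet  # should import cleanly after `python setup.py develop`

    print("torch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())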

Model Zoo

NanoDet supports a variety of backbones. Go to the config folder to see the sample training config files.

| Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
|:-----:|:--------:|:----------:|:--------:|:-----:|:------:|:----------------:|
| NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72G | 0.95M | Download |
| NanoDet-Plus-m-320 (NEW) | ShuffleNetV2 1.0x | 320*320 | 27.0 | 0.9G | 1.17M | Weight / Checkpoint |
| NanoDet-Plus-m-416 (NEW) | ShuffleNetV2 1.0x | 416*416 | 30.4 | 1.52G | 1.17M | Weight / Checkpoint |
| NanoDet-Plus-m-1.5x-320 (NEW) | ShuffleNetV2 1.5x | 320*320 | 29.9 | 1.75G | 2.44M | Weight / Checkpoint |
| NanoDet-Plus-m-1.5x-416 (NEW) | ShuffleNetV2 1.5x | 416*416 | 34.1 | 2.97G | 2.44M | Weight / Checkpoint |

Notice: the Weight file contains only the parameters used at inference time, while the Checkpoint additionally contains training-time state (optimizer state, epoch, etc.).
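
To see the difference concretely, you can inspect a downloaded checkpoint with torch.load. This sketch assumes the standard PyTorch Lightning checkpoint layout (NanoDet trains with PyTorch Lightning, see "How to Train"); adjust the file name to the artifact you downloaded:

    import torch

    # A Lightning-style checkpoint bundles the weights ("state_dict") with
    # training-time state; a plain weight file is essentially just the
    # parameter dict itself.
    ckpt = torch.load("nanodet-plus-m_416_checkpoint.ckpt", map_location="cpu")
    print(list(ckpt.keys()))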

Legacy Model Zoo

| Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
|:-----:|:--------:|:----------:|:--------:|:-----:|:------:|:----------------:|
| NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2G | 0.95M | Download |
| NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44G | 2.08M | Download |
| NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42G | 2.08M | Download |
| NanoDet-m-0.5x | ShuffleNetV2 0.5x | 320*320 | 13.5 | 0.3G | 0.28M | Download |
| NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96G | 1.36M | Download |
| NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2G | 3.81M | Download |
| NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72G | 3.11M | Download |
| NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06G | 4.01M | Download |
| NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12G | 4.71M | Download |
| NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3G | 6.75M | Download |

How to Train

  1. Prepare dataset

    If your dataset annotations are in Pascal VOC XML format, refer to config/nanodet_custom_xml_dataset.yml.

    Otherwise, convert your dataset annotations to MS COCO format (COCO annotation format details).

  2. Prepare config file

    Copy and modify an example yml config file in the config/ folder.

    Change save_dir to where you want to save the model.

    Change num_classes in model->arch->head.

    Change the image path and annotation path in both data->train and data->val.

    Set gpu_ids, workers_per_gpu and batchsize_per_gpu under device to fit your hardware.

    Set total_epochs, lr and lr_schedule according to your dataset and batch size.

    If you want to modify the network, data augmentation or other settings, please refer to Config File Detail. A short excerpt of the most commonly edited fields follows this list.

  3. Start training

    NanoDet now uses PyTorch Lightning for training.

    For both single-GPU and multi-GPU training, run:

    python tools/train.py CONFIG_FILE_PATH
  4. Visualize Logs

    TensorBoard logs are saved in the save_dir you set in the config file.

    To visualize tensorboard logs, run:

    cd <YOUR_SAVE_DIR>
    tensorboard --logdir ./
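
For reference, a short excerpt of the config fields edited in step 2, distilled from the example configs; every path and value below is a placeholder, not a recommendation:

    save_dir: workspace/my_model        # where logs and checkpoints are written
    model:
      arch:
        head:
          num_classes: 1                # model->arch->head
    data:
      train:
        img_path: path/to/train/images
        ann_path: path/to/train_annotations.json
      val:
        img_path: path/to/val/images
        ann_path: path/to/val_annotations.json
    device:
      gpu_ids: [0]
      workers_per_gpu: 6
      batchsize_per_gpu: 16
    schedule:
      total_epochs: 300
      optimizer:
        lr: 0.001
    class_names: ['my_class']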

How to Deploy

NanoDet provides multi-backend C++ demos, including ncnn, OpenVINO and MNN. There is also an Android demo based on the ncnn library.

Export model to ONNX

To convert a NanoDet PyTorch model to ncnn, you can use this path: PyTorch -> ONNX -> ncnn.

To export an ONNX model, run tools/export_onnx.py:

python tools/export_onnx.py --cfg_path ${CONFIG_PATH} --model_path ${PYTORCH_MODEL_PATH}
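
For example, with the NanoDet-Plus-m-416 artifacts named above (adjust the paths to your own files):

python tools/export_onnx.py --cfg_path config/nanodet-plus-m_416.yml --model_path nanodet-plus-m_416_checkpoint.ckpt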

Run NanoDet in C++ with inference libraries

ncnn

Please refer to demo_ncnn.

OpenVINO

Please refer to demo_openvino.

MNN

Please refer to demo_mnn.

Run NanoDet on Android

Please refer to android_demo.


Citation

If you find this project useful in your research, please consider citing:

@misc{nanodet,
    title={NanoDet-Plus: Super fast and high accuracy lightweight anchor-free object detection model.},
    author={RangiLyu},
    howpublished = {\url{https://github.com/RangiLyu/nanodet}},
    year={2021}
}

Thanks

https://github.com/Tencent/ncnn

https://github.com/open-mmlab/mmdetection

https://github.com/implus/GFocal

https://github.com/cmdbug/YOLOv5_NCNN

https://github.com/rbgirshick/yacs

Comments
  • Error when testing after training 10 epochs: list object has no attribute cpu

    File "nanodet-main/nanodet/trainer/trainer.py", line 89, in run_epoch
        results[meta['img_info']['id'].cpu().numpy()[0]] = dets
    AttributeError: 'list' object has no attribute 'cpu'

    opened by DL-Practise 16
  • Training nanodet from scratch

    Hi, I'm training the NanoDet-m model (ShuffleNetV2 1.0x | 320*320) from scratch on the COCO dataset with 4 GeForce RTX 2080 Ti GPUs. Convergence seems pretty slow; it could take 1-2 weeks.

    May I ask how long it took you to reach 20.6 mAP, and which setup you used?

    Thank you.

    bug help wanted 
    opened by Cloudz333 10
  • Questions about deployment

    Hello, I'd like to ask two questions:

    1. In the NanoDet::detect(cv::Mat image, float score_threshold, float nms_threshold) function in nanodet.cpp, the input is fed to the model with ex.input("input.1", input);. What does input.1 mean here? Is it the name of the input layer, and how can I find that name from PyTorch? print(model) doesn't show layer names, and most examples under Tencent/ncnn/tree/master/examples use ex.input("input", input);. If I load a model I trained myself, how should I match this name?
    2. In nanodet.h there is a std::vector<HeadInfo> heads_info. What exactly do its values mean? Are they related to the network outputs?
        std::vector<HeadInfo> heads_info{
            // cls_pred|dis_pred|stride
                {"792", "795",    8},
                {"814", "817",   16},
                {"836", "839",   32},
        };
    

    I'm not very familiar with PyTorch or the NanoDet network; my apologies.

    opened by busyyang 8
  • A small problem when running demo.py

    My environment: cuda==10.1, pytorch==1.7, torchvision==0.8.0. When I run "python demo/demo.py image --config CONFIG_PATH --model MODEL_PATH --path IMAGE_PATH" to infer on an image, I get the error: RuntimeError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. 'torchvision::nms' is only available for these backends: [CPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, Tracer, Autocast, Batched, VmapMode].

    CPU: registered at /root/project/torchvision/csrc/vision.cpp:59 [kernel]
    BackendSelect: fallthrough registered at /pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
    Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
    AutogradOther: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:35 [backend fallback]
    AutogradCPU: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:39 [backend fallback]
    AutogradCUDA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:43 [backend fallback]
    AutogradXLA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:47 [backend fallback]
    Tracer: fallthrough registered at /pytorch/torch/csrc/jit/frontend/tracer.cpp:967 [backend fallback]
    Autocast: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:254 [backend fallback]
    Batched: registered at /pytorch/aten/src/ATen/BatchingRegistrations.cpp:511 [backend fallback]
    VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]

    But after I modified the batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False) function in /nanodet/nanodet/model/module/nms.py as follows:

    boxes_for_nms = boxes_for_nms.cpu()
    scores = scores.cpu()
    boxes = boxes.cpu()
    split_thr = nms_cfg_.pop('split_thr', 10000)
    if len(boxes_for_nms) < split_thr:
        # dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)
        keep = nms(boxes_for_nms, scores, **nms_cfg_)
        boxes = boxes[keep]
        # scores = dets[:, -1]
        scores = scores[keep]
    

    demo.py runs normally.

    opened by lidongliang666 8
  • Results got worse after adding mosaic augmentation; what could be the reason?

    coco.py

    if self.load_mosaic and not isval:
        img4, labels4, bbox4 = load_mosaic(self, idx)
        meta['img_info']['height'] = img4.shape[0]
        meta['img_info']['width'] = img4.shape[1]
        meta['img'] = img4
        meta['gt_labels'] = labels4
        meta['gt_bboxes'] = bbox4

    meta = self.pipeline(self, meta, input_size)

    meta["img"] = torch.from_numpy(meta["img"].transpose(2, 0, 1))
    return meta

    The bboxes printed for testing inside ShapeTransform look normal:

    meta_data["img"] = img
            meta_data["warp_matrix"] = M
            if "gt_bboxes" in meta_data:
                boxes = meta_data["gt_bboxes"]
                meta_data["gt_bboxes"] = warp_boxes(boxes, M, dst_shape[0], dst_shape[1])
            if "gt_masks" in meta_data:
                for i, mask in enumerate(meta_data["gt_masks"]):
                    meta_data["gt_masks"][i] = cv2.warpPerspective(
                        mask, M, dsize=tuple(dst_shape)
                    )
            for i in range(meta_data["gt_bboxes"].shape[0]):
                cv2.rectangle(img, (int(meta_data["gt_bboxes"][i][0]), int(meta_data["gt_bboxes"][i][1])), (int(meta_data["gt_bboxes"][i][2]), int(meta_data["gt_bboxes"][i][3])), (255,0,0), 2)
            cv2.imwrite('./%d.jpg' % int(meta_data["gt_bboxes"][0][0]), img)
    

    What could possibly be causing this?

    opened by Rokuki 6
  • Cannot find blob with name: dis_pred_stride_8

    I converted the pretrained models and tested them with demo_ncnn and demo_openvino. The conversion succeeds in both cases, but inference fails. How can I fix this?

    # demo_ncnn
    find_blob_index_by_name input.1 failed
    Try
    find_blob_index_by_name dis_pred_stride_8 failed
    Try
    find_blob_index_by_name cls_pred_stride_8 failed
    
    # demo_openvino
    start init model
    success
    terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
    what(): Cannot find blob with name: dis_pred_stride_8
    

    The onnx model contains nodes such as dis_pred_stride_8, but these nodes disappear in the converted ncnn model. (Screenshots of the onnx and ncnn network structures were attached.)

    opened by TTMRonald 6
  • Cannot find blob with name: 795

    I converted the NanoDet-EfficientLite 512x512 model with OpenVINO 2021.3.394. It converts successfully and the program loads it, but inference fails with the following log:

    start init model
    success
    terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
    what(): Cannot find blob with name: 795

    Has anyone run into this?

    opened by deep-practice 6
  • CoreML export failure: 'ConvModule' object has no attribute 'norm'

    Hi, I tried to convert nanodet-m.pth to CoreML for iOS. I used coremltools following the guide and got the error "CoreML export failure: 'ConvModule' object has no attribute 'norm'". Reading the nanodet source, I found that the norm in the head is BN, which should be supported by CoreML, so I don't know why this error happens. Has anyone tried CoreML? Thanks!

    opened by ghoshaw 6
  • No result when using a single-class nano model in ncnn

    Hi, I trained a person-class nanodet model, converted it to onnx via tools/export.py, then converted it to an ncnn model, but the ncnn model produces no output. I changed the class count and image size in the cpp code, so I don't know whether the error happens during the onnx export or the onnx->ncnn conversion. Below is the cfg I used for training:

    #Config File example
    save_dir: workspace/nanodet_m
    model:
      arch:
        name: GFL
        backbone:
          name: ShuffleNetV2
          model_size: 1.0x
          out_stages: [2,3,4]
          activation: LeakyReLU
        fpn:
          name: PAN
          in_channels: [116, 232, 464]
          out_channels: 96
          start_level: 0
          num_outs: 3
        head:
          name: NanoDetHead
          num_classes: 1
          input_channel: 96
          feat_channels: 96
          stacked_convs: 2
          share_cls_reg: True
          octave_base_scale: 5
          scales_per_octave: 1
          strides: [8, 16, 32]
          reg_max: 7
          norm_cfg:
            type: BN
          loss:
            loss_qfl:
              name: QualityFocalLoss
              use_sigmoid: True
              beta: 2.0
              loss_weight: 1.0
            loss_dfl:
              name: DistributionFocalLoss
              loss_weight: 0.25
            loss_bbox:
              name: GIoULoss
              loss_weight: 2.0
    data:
      train:
        name: coco
        img_path: ../data/yoga_coco/images/train2017
        ann_path: ../data/yoga_coco/annotations/instances_train2017.json
        input_size: [416,416] #[w,h]
        keep_ratio: True
        pipeline:
          perspective: 0.0
          scale: [0.6, 1.4]
          stretch: [[1, 1], [1, 1]]
          rotation: 0
          shear: 0
          translate: 0.2
          flip: 0.5
          brightness: 0.2
          contrast: [0.8, 1.2]
          saturation: [0.8, 1.2]
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
      val:
        name: coco
        img_path: ../data/yoga_coco/images/val2017
        ann_path: ../data/yoga_coco/annotations/instances_val2017.json
        input_size: [416,416] #[w,h]
        keep_ratio: True
        pipeline:
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
    device:
      gpu_ids: [0]
      workers_per_gpu: 6
      batchsize_per_gpu: 40
    schedule:
    #  resume:
    #  load_model: YOUR_MODEL_PATH
      optimizer:
        name: SGD
        lr: 0.14
        momentum: 0.9
        weight_decay: 0.0001
      warmup:
        name: linear
        steps: 300
        ratio: 0.1
      total_epochs: 50
      lr_schedule:
        name: MultiStepLR
        milestones: [130,160,175,185]
        gamma: 0.1
      val_intervals: 10
    evaluator:
      name: CocoDetectionEvaluator
      save_key: mAP
    
    log:
      interval: 10
    
    class_names: ['person',]
    

    When I convert the 80-class model to ncnn it produces results, so I'd like to ask: when converting a single-class model, is there any configuration that needs to be changed?

    opened by Sean-hku 6
  • Problems converting pth to onnx to ncnn

    Hello, I converted a pytorch model to onnx and then to an ncnn model, but the detection results from the ncnn model are wrong. A few changes I made: I set the val input in the config to 64x64 and changed the input size in tools/export.py to 64x64. The steps were:

    python tools/export.py
    python -m onnxsim output.onnx output-sim.onnx
    build/tools/onnx/onnx2ncnn output-sim.onnx output-sim.param output-sim.bin
    build/tools/ncnnoptimize output-sim.param output-sim.bin new-output-sim.param new-output-sim.bin 0

    Versions: pytorch 1.7.1, onnx 1.8.0, onnx-simplifier 0.2.19, onnxoptimizer 0.1.1, onnxruntime 1.6.0

    Is there something wrong with these steps?

    opened by yhl41001 6
  • original pytorch or onnx model

    Could you please provide pretrained pytorch or onnx model weights also? I noticed you only shared converted ncnn models, but I would like to see the speed of inference on gpu/npu accelerated systems

    opened by kadirbeytorun 6
  • python tools/train.py config/nanodet-plus-m_320.yml

    Tried: python tools/train.py config/nanodet-plus-m_320.yml

    pytorch_lightning.utilities.cloud_io.get_filesystem has been deprecated in v1.8.0 and will be
    [NanoDet][01-04 10:28:00]INFO:Setting up data...
    loading annotations into memory...
    Done (t=18.55s)
    creating index...
    index created!
    loading annotations into memory...
    Done (t=0.56s)
    creating index...
    index created!
    [NanoDet][01-04 10:28:21]INFO:Creating model...
    model size is 1.0x
    init weights...
    => loading pretrained model https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth
    Finish initialize NanoDet-Plus Head.
    GPU available: True (cuda), used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    /root/anaconda3/envs/nanodet/lib/python3.7/site-packages/torch/cuda/__init__.py:143: UserWarning:
    NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
    The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
    If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
    warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

      | Name      | Type        | Params
    ------------------------------------------
    0 | model     | NanoDetPlus | 4.3 M
    1 | avg_model | NanoDetPlus | 4.3 M
    ------------------------------------------
    8.7 M     Trainable params
    0         Non-trainable params
    8.7 M     Total params
    34.647    Total estimated model params size (MB)
    [NanoDet][01-04 10:28:21]INFO:Weight Averaging is enabled
    PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument (try 40, which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
    Traceback (most recent call last):
      File "tools/train.py", line 146, in <module>
        main(args)
      File "tools/train.py", line 141, in main
        trainer.fit(task, train_dataloader, val_dataloader)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 604, in fit
        self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
        self._run(model, ckpt_path=self.ckpt_path)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
        results = self._run_stage()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
        self._run_train()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
        self.fit_loop.run()
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/loops/loop.py", line 194, in run
        self.on_run_start(*args, **kwargs)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/loops/fit_loop.py", line 206, in on_run_start
        self.trainer.reset_train_dataloader(self.trainer.lightning_module)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1552, in reset_train_dataloader
        if has_len_all_ranks(self.train_dataloader, self.strategy, module)
      File "/root/anaconda3/envs/nanodet/lib/python3.7/site-packages/pytorch_lightning/utilities/data.py", line 110, in has_len_all_ranks
        if total_length == 0:
    RuntimeError: CUDA error: no kernel image is available for execution on the device
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    python3.7, cuda==10.2, gpu==RTX 3090, Ubuntu 20.04

    Thanks

    opened by molyswu 0
  • Fails to train a model on a dataset with a single class

    I used the converted COCO 2017 with only labeled persons. Here is my config:

    save_dir: workspace/nanodet-plus-m_416
    model:
      weight_averager:
        name: ExpMovingAverager
        decay: 0.9998
      arch:
        name: NanoDetPlus
        detach_epoch: 10
        backbone:
          name: ShuffleNetV2
          model_size: 1.0x
          out_stages: [2,3,4]
          activation: LeakyReLU
        fpn:
          name: GhostPAN
          in_channels: [116, 232, 464]
          out_channels: 96
          kernel_size: 5
          num_extra_level: 1
          use_depthwise: True
          activation: LeakyReLU
        head:
          name: NanoDetPlusHead
          num_classes: 1
          input_channel: 96
          feat_channels: 96
          stacked_convs: 2
          kernel_size: 5
          strides: [8, 16, 32, 64]
          activation: LeakyReLU
          reg_max: 1
          norm_cfg:
            type: BN
          loss:
            loss_qfl:
              name: QualityFocalLoss
              use_sigmoid: True
              beta: 2.0
              loss_weight: 1.0
            loss_dfl:
              name: DistributionFocalLoss
              loss_weight: 0.25
            loss_bbox:
              name: GIoULoss
              loss_weight: 2.0
        # Auxiliary head, only use in training time.
        aux_head:
          name: SimpleConvHead
          num_classes: 1
          input_channel: 192
          feat_channels: 192
          stacked_convs: 4
          strides: [8, 16, 32, 64]
          activation: LeakyReLU
          reg_max: 1
    data:
      train:
        name: CocoDataset
        img_path: /home/mosminin/fiftyone/coco_person/train/data
        ann_path: /home/mosminin/fiftyone/coco_person/train/labels.json
        input_size: [416,416] #[w,h]
        keep_ratio: False
        pipeline:
          perspective: 0.0
          scale: [0.6, 1.4]
          stretch: [[0.8, 1.2], [0.8, 1.2]]
          rotation: 0
          shear: 0
          translate: 0.2
          flip: 0.5
          brightness: 0.2
          contrast: [0.6, 1.4]
          saturation: [0.5, 1.2]
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
      val:
        name: CocoDataset
        img_path: /home/mosminin/fiftyone/coco_person/validation/data
        ann_path: /home/mosminin/fiftyone/coco_person/validation/labels.json
        input_size: [416,416] #[w,h]
        keep_ratio: False
        pipeline:
          normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
    device:
      gpu_ids: [0]
      workers_per_gpu: 6
      batchsize_per_gpu: 16
    schedule:
    #  resume:
    #  load_model:
      optimizer:
        name: AdamW
        lr: 0.001
        weight_decay: 0.05
      warmup:
        name: linear
        steps: 500
        ratio: 0.0001
      total_epochs: 300
      lr_schedule:
        name: CosineAnnealingLR
        T_max: 300
        eta_min: 0.00005
      val_intervals: 10
    grad_clip: 35
    evaluator:
      name: CocoDetectionEvaluator
      save_key: mAP
    log:
      interval: 50
    
    class_names: ['person']
    

    I also changed train.py to use the CPU instead of the GPU so the errors would be more understandable.

        # if cfg.device.gpu_ids == -1:
        #     logger.info("Using CPU training")
        #     accelerator, devices, strategy = "cpu", None, None
        # else:
        #     accelerator, devices, strategy = "gpu", cfg.device.gpu_ids, None
    
        accelerator, devices, strategy = "cpu", None, None # CPU training
    
    

    After running it, I get the following errors.

    (.venv) [email protected]:~/dev/nanodet$ python tools/train.py /home/mosminin/dev/nanodet/config/nanodet-plus-m_416_person.yml
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/utilities/cloud_io.py:33: LightningDeprecationWarning: `pytorch_lightning.utilities.cloud_io.get_filesystem` has been deprecated in v1.8.0 and will be removed in v1.10.0. Please use `lightning_lite.utilities.cloud_io.get_filesystem` instead.
      rank_zero_deprecation(
    [NanoDet][12-18 14:05:30]INFO:Setting up data...
    loading annotations into memory...
    Done (t=4.35s)
    creating index...
    index created!
    loading annotations into memory...
    Done (t=0.16s)
    creating index...
    index created!
    [NanoDet][12-18 14:05:35]INFO:Creating model...
    model size is  1.0x
    init weights...
    => loading pretrained model https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth
    Finish initialize NanoDet-Plus Head.
    GPU available: True (cuda), used: False
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/setup.py:175: PossibleUserWarning: GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`.
      rank_zero_warn(
    
      | Name      | Type        | Params
    ------------------------------------------
    0 | model     | NanoDetPlus | 4.1 M 
    1 | avg_model | NanoDetPlus | 4.1 M 
    ------------------------------------------
    8.2 M     Trainable params
    0         Non-trainable params
    8.2 M     Total params
    32.903    Total estimated model params size (MB)
    [NanoDet][12-18 14:05:35]INFO:Weight Averaging is enabled
    /home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    Traceback (most recent call last):
      File "/home/mosminin/dev/nanodet/tools/train.py", line 147, in <module>
        main(args)
      File "/home/mosminin/dev/nanodet/tools/train.py", line 142, in main
        trainer.fit(task, train_dataloader, val_dataloader)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 603, in fit
        call._call_and_handle_interrupt(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
        self._run(model, ckpt_path=self.ckpt_path)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
        results = self._run_stage()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
        self._run_train()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
        self.fit_loop.run()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
        self._outputs = self.epoch_loop.run(self._data_fetcher)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 214, in advance
        batch_output = self.batch_loop.run(kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
        outputs = self.optimizer_loop.run(optimizers, kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
        self.advance(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 200, in advance
        result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 247, in _run_optimization
        self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 357, in _optimizer_step
        self.trainer._call_lightning_module_hook(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1342, in _call_lightning_module_hook
        output = fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/trainer/task.py", line 281, in optimizer_step
        optimizer.step(closure=optimizer_closure)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 169, in step
        step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 234, in optimizer_step
        return self.precision_plugin.optimizer_step(
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 121, in optimizer_step
        return optimizer.step(closure=closure, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
        return wrapped(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/optimizer.py", line 140, in wrapper
        out = func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/optim/adamw.py", line 120, in step
        loss = closure()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 107, in _wrap_closure
        closure_result = closure()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 147, in __call__
        self._result = self.closure(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 133, in closure
        step_output = self._step_fn()
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 406, in _training_step
        training_step_output = self.trainer._call_strategy_hook("training_step", *kwargs.values())
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1480, in _call_strategy_hook
        output = fn(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 378, in training_step
        return self.model.training_step(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/trainer/task.py", line 78, in training_step
        preds, loss, loss_states = self.model.forward_train(batch)
      File "/home/mosminin/dev/nanodet/nanodet/model/arch/nanodet_plus.py", line 56, in forward_train
        loss, loss_states = self.head.loss(head_out, gt_meta, aux_preds=aux_head_out)
      File "/home/mosminin/dev/nanodet/nanodet/model/head/nanodet_plus_head.py", line 198, in loss
        batch_assign_res = multi_apply(
      File "/home/mosminin/dev/nanodet/nanodet/util/misc.py", line 24, in multi_apply
        return tuple(map(list, zip(*map_results)))
      File "/home/mosminin/dev/nanodet/.venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/mosminin/dev/nanodet/nanodet/model/head/nanodet_plus_head.py", line 314, in target_assign_single_img
        assign_result = self.assigner.assign(
      File "/home/mosminin/dev/nanodet/nanodet/model/head/assigner/dsl_assigner.py", line 86, in assign
        F.one_hot(gt_labels.to(torch.int64), pred_scores.shape[-1])
    RuntimeError: Class values must be smaller than num_classes.
    
    

    What am I doing wrong?

    opened by Octopusmode 0
  • Adapting the code to output a center x, y instead of bounding boxes (x1, y1, x2, y2)

    Hey, I'm not too familiar with machine learning and the like, and I'm not exactly ready to spend the next 2 months (yet) learning how TensorFlow works and such, so I'm hoping someone can assist me with this.

    So far, my experience with nanodet has been great; but manually annotating images takes a lot of time which I don't have. Because I don't really need the bounding box information anyway, I figured I'd look for a way to output only the center of objects rather than the top-left and bottom-right corners.

    Help would be highly appreciated 😄

    opened by icecreamnotallowed 0
  • The onnx model output (exported by export_onnx.py) differs from the pytorch model

    import cv2
    import numpy as np
    import onnxruntime as rt

    def image_preprocess(img_path):
        img = cv2.imread(img_path).astype("float32") / 255
        # mean = [103.53, 116.28, 123.675]  # ImageNet values
        # std = [57.375, 57.12, 58.395]
        mean = [113.533554, 118.14172, 123.63607]
        std = [21.405144, 21.405144, 21.405144]
        mean = np.array(mean, dtype=np.float32).reshape(1, 1, 3) / 255
        std = np.array(std, dtype=np.float32).reshape(1, 1, 3) / 255
        img = (img - mean) / std
        img = np.transpose(img, (2, 0, 1))
        img = np.expand_dims(img, axis=0)
        return img

    def test_onnx_model(onnx_model, img_path=None):
        if img_path is None:
            img_path = "path for img"
        imgdata = image_preprocess(img_path)
        sess = rt.InferenceSession(onnx_model)
        input_name = sess.get_inputs()[0].name
        output_detect_name = sess.get_outputs()[0].name
        pred_onnx0 = sess.run([output_detect_name], {input_name: imgdata})
        print("outputs:")
        print(np.array(pred_onnx0))

    opened by Genlk 0
  • Fixes a couple of issues to add fp16 training support

    There were a couple of issues when trying to use fp16 training. One was that fp16 was not exposed through the configuration system. The other was that the DynamicSoftLabelAssigner used binary_cross_entropy instead of binary_cross_entropy_with_logits. This change moves where sigmoid is called on the predictions so that the more stable binary_cross_entropy_with_logits can be used and the Trainer can be configured to use fp16 precision.

    opened by crisp-snakey 0
Releases(v1.0.0-alpha-1)
  • v1.0.0-alpha-1(Dec 26, 2021)

    NanoDet-Plus v1.0.0-alpha

    In NanoDet-Plus, we propose a novel label assignment strategy with a simple assign guidance module (AGM) and a dynamic soft label assigner (DSLA) to solve the optimal label assignment problem in lightweight model training. We also introduce a light feature pyramid called Ghost-PAN to enhance multi-layer feature fusion. These improvements boost the previous NanoDet's detection accuracy by 7 mAP on the COCO dataset.


    | Model | Resolution | mAP (0.5:0.95) | CPU Latency (i7-8700) | ARM Latency (4xA76) | FLOPS | Params | Model Size |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | 320*320 | 20.6 | 4.98ms | 10.23ms | 0.72G | 0.95M | 1.8MB(FP16) / 980KB(INT8) |
    | NanoDet-Plus-m | 320*320 | 27.0 | 5.25ms | 11.97ms | 0.9G | 1.17M | 2.3MB(FP16) / 1.2MB(INT8) |
    | NanoDet-Plus-m | 416*416 | 30.4 | 8.32ms | 19.77ms | 1.52G | 1.17M | 2.3MB(FP16) / 1.2MB(INT8) |
    | NanoDet-Plus-m-1.5x | 320*320 | 29.9 | 7.21ms | 15.90ms | 1.75G | 2.44M | 4.7MB(FP16) / 2.3MB(INT8) |
    | NanoDet-Plus-m-1.5x | 416*416 | 34.1 | 11.50ms | 25.49ms | 2.97G | 2.44M | 4.7MB(FP16) / 2.3MB(INT8) |
    | YOLOv3-Tiny | 416*416 | 16.6 | - | 37.6ms | 5.62G | 8.86M | 33.7MB |
    | YOLOv4-Tiny | 416*416 | 21.7 | - | 32.81ms | 6.96G | 6.06M | 23.0MB |
    | YOLOX-Nano | 416*416 | 25.8 | - | 23.08ms | 1.08G | 0.91M | 1.8MB(FP16) |
    | YOLOv5-n | 640*640 | 28.4 | - | 44.39ms | 4.5G | 1.9M | 3.8MB(FP16) |
    | FBNetV5 | 320*640 | 30.4 | - | - | 1.8G | - | - |
    | MobileDet | 320*320 | 25.6 | - | - | 0.9G | - | - |

    Model checkpoints and weights

    Download in the release files.

    Source code(tar.gz)
    Source code(zip)
    nanodet-plus-m-1.5x_320.onnx(9.43 MB)
    nanodet-plus-m-1.5x_320_checkpoint.ckpt(61.63 MB)
    nanodet-plus-m-1.5x_416.onnx(9.43 MB)
    nanodet-plus-m-1.5x_416_checkpoint.ckpt(61.63 MB)
    nanodet-plus-m-1.5x_416_ncnn.zip(4.40 MB)
    nanodet-plus-m-1.5x_416_openvino.zip(4.39 MB)
    nanodet-plus-m_320.onnx(4.57 MB)
    nanodet-plus-m_320_checkpoint.ckpt(33.82 MB)
    nanodet-plus-m_416.onnx(4.57 MB)
    nanodet-plus-m_416_checkpoint.ckpt(33.82 MB)
    nanodet-plus-m_416_mnn.mnn(4.59 MB)
    nanodet-plus-m_416_ncnn.zip(2.11 MB)
    nanodet-plus-m_416_openvino.zip(2.11 MB)
  • v0.4.2(Aug 22, 2021)

    v0.4.2

    Fixes some compatibility issues of NanoDet v0.4:

    • Fix pytorch-lightning compatibility. (#304, #309)
    • Fix pytorch 1.9 compatibility. (#308)
    • Support not raising an error when evaluating with empty results. (#310)

    I'm doing a lot of refactoring. NanoDet v1.x is coming soon.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight | ncnn model | ncnn-int8 |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download | Download | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download | Download | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download | Download | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download | Download | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download | | |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download | | |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download | | |

    Source code(tar.gz)
    Source code(zip)
  • v0.4.1(Jul 17, 2021)

    v0.4.1

    This is the final release of NanoDet v0.x.

    I'm doing a lot of refactoring. NanoDet v1.x is coming soon.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight | ncnn model | ncnn-int8 |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download | Download | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download | Download | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download | Download | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download | Download | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download | | |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download | | |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download | | |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download | | |

    Source code(tar.gz)
    Source code(zip)
  • v0.4.0(Jun 8, 2021)

    What's new in v0.4.0

    1. Fix a little bug in demo.py by BlainWu (#210)
    2. Add script to export TorchScript model by strawberrypie (#211)
    3. Use fixed output names when exporting ONNX (#218)
    4. Use scale_factor instead of fixed size in resize to support dynamic shape inference (#218)
    5. Ensure num_classes equal len(class_names) by ZHEQIUSHUI (#221)
    6. Fix a bug in mnn demo while using GPU device by AcherStyx (#234)
    7. Fix with_last_conv bug in shufflenet (#239)
    8. Support batch eval (#241)
    9. Add nanodet-m-1.5x models (#242)
    10. Update model benchmark (#246)
    11. Prevent lightning Trainer from disabling cudnn.benchmark (#249)
    12. Fix multi-GPU evaluation bug with pytorch-lightning (#254)

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-m-1.5x | ShuffleNetV2 1.5x | 320*320 | 23.5 | 1.44B | 2.08M | Download |
    | NanoDet-m-1.5x-416 | ShuffleNetV2 1.5x | 416*416 | 26.8 | 2.42B | 2.08M | Download |
    | NanoDet-t | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Download ncnn models below

    Source code(tar.gz)
    Source code(zip)
    ncnn-nanodet-m-1.5x-416-int8.zip(1.82 MB)
    ncnn-nanodet-m-1.5x-416.zip(3.67 MB)
    ncnn-nanodet-m-1.5x-int8.zip(1.82 MB)
    ncnn-nanodet-m-1.5x.zip(3.66 MB)
    ncnn-nanodet-m-416-int8.zip(882.58 KB)
    ncnn-nanodet-m-416.zip(1.64 MB)
    ncnn-nanodet-m-int8.zip(888.76 KB)
    ncnn-nanodet-m.zip(1.64 MB)
  • v0.3.0(Apr 11, 2021)

    What's new in v0.3.0

    1. Refactor training and testing code with pytorch-lightning.
    2. Solving ONNX inference AxisError by zshn25 (#198).

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-t (NEW) | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
    nanodet_m_ncnn_model.zip(1.64 MB)
  • v0.2.0(Mar 29, 2021)

    What's new in v0.2.0

    1. Add pyncnn demo by caishanli (#167).
    2. Fix ncnn demo build failure without vulkan by nihui (#168).
    3. Add NanoDet-t with Transformer Attention Network (#183).
    4. Add Notebook demo by zhiqwang (#188).
    5. Add feature of saving demo inference result by wwdok (#191).
    6. Fix utf-8 decode bug (#184).
    7. Fix test bug.

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-t (NEW) | ShuffleNetV2 1.0x | 320*320 | 21.7 | 0.96B | 1.36M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Mar 7, 2021)

    What's new in v0.1.0

    1. Support MNN python and cpp inference (#83 ).
    2. Support OpenVINO inference.
    3. Support libtorch inference experimentally.
    4. Add NanoDet-g.
    5. Add EfficientNet-Lite and Rep-VGG backbone.
    6. Add Model Zoo and provide more pre-trained model.
    7. Refactor GFL head (#154 ).

    Download pretrained models

    | Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
    |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
    | NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
    | NanoDet-m-416 | ShuffleNetV2 1.0x | 416*416 | 23.5 | 1.2B | 0.95M | Download |
    | NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
    | NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
    | NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Nov 22, 2020)
