I tried to apply the CAM algorithm to YOLOv4 and it worked.

Last update: Dec 05, 2022

Related tags

Deep Learning YOLOv4_CAM

Overview

YOLOV4: A PyTorch implementation of the You Only Look Once object detection model

Update, February 7, 2021:
Added a letterbox_image option. With letterbox_image turned off, the network's mAP improves substantially.
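As a rough illustration of what a letterbox resize does, the sketch below computes the resized image size and the padding offsets needed to fit an image into a square network input without distorting its aspect ratio. The function name `letterbox_params` is hypothetical and is not taken from this repository; with the option turned off, the image would instead be stretched directly to the target size.

```python
def letterbox_params(image_w, image_h, target_w, target_h):
    """Return (new_w, new_h, pad_x, pad_y) for an aspect-preserving resize.

    Hypothetical helper: integer arithmetic is used so the result does not
    depend on floating-point rounding.
    """
    if image_w * target_h >= image_h * target_w:
        # Width is the limiting side: fill the target width.
        new_w = target_w
        new_h = image_h * target_w // image_w
    else:
        # Height is the limiting side: fill the target height.
        new_h = target_h
        new_w = image_w * target_h // image_h
    # The resized image is centered; the remainder is padding.
    pad_x = (target_w - new_w) // 2
    pad_y = (target_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


print(letterbox_params(1280, 720, 416, 416))
```

For a 1280x720 image and a 416x416 input, the image is scaled to 416x234 and padded by 91 pixels at the top and bottom.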

Performance

Training dataset	Weights file	Test dataset	Input image size	mAP 0.5:0.95	mAP 0.5
VOC07+12+COCO	yolo4_voc_weights.pth	VOC-Test07	416x416	-	89.0
COCO-Train2017	yolo4_weights.pth	COCO-Val2017	416x416	46.1	70.2

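Implemented features

Backbone feature extraction network: DarkNet53 => CSPDarkNet53
Feature pyramid: SPP, PAN
Training tricks: Mosaic data augmentation, label smoothing, CIoU, cosine annealing learning rate decay
Activation function: Mish
... and so on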
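Of the techniques listed above, CIoU is the easiest to show in a few lines. The sketch below follows the standard CIoU definition (IoU minus a center-distance penalty and an aspect-ratio penalty) for two boxes given as (x1, y1, x2, y2). It is a plain-Python illustration, not the loss code used in this repository; the `eps` term is an assumption added to avoid division by zero.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)

    # Squared distance between the box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1 + eps))
        - math.atan((ax2 - ax1) / (ay2 - ay1 + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return iou - rho2 / (c2 + eps) - alpha * v
```

Identical boxes give a value close to 1, while boxes that do not overlap give a negative value, so a loss of the form `1 - ciou` still provides a gradient when the prediction misses the target entirely.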

Requirements

torch==1.2.0

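Notes

The yolo4_weights.pth in the code was trained on 608x608 images, but because of GPU memory limits I changed the image size in the code to 416x416. Change it back if you need to. The default anchors in the code are based on 608x608 images.
Do not use Chinese labels, and make sure folder names contain no spaces!
Before training, you must create a txt file under model_data listing the classes to detect, and point classes_path in train.py to that file.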

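Trick settings

In train.py:
1. The mosaic parameter controls whether Mosaic data augmentation is used.
2. Cosine_scheduler controls whether cosine annealing learning rate decay is used.
3. label_smoothing controls whether label smoothing is applied.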
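To make two of these switches concrete, here is a minimal sketch of label smoothing and a cosine annealing schedule in plain Python. The function names `smooth_labels` and `cosine_lr`, and the `min_lr` parameter, are illustrative assumptions; the exact formulas used in train.py may differ.

```python
import math


def smooth_labels(one_hot, smoothing, num_classes):
    """Move a fraction `smoothing` of the probability mass to a uniform prior."""
    return [y * (1.0 - smoothing) + smoothing / num_classes for y in one_hot]


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Learning rate that decays from base_lr to min_lr along half a cosine."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# A hard target for class 1 of 3 becomes a slightly softened distribution.
print(smooth_labels([0.0, 1.0, 0.0], smoothing=0.1, num_classes=3))

# The schedule starts at base_lr and ends at min_lr.
for step in (0, 50, 100):
    print(step, cosine_lr(step, 100, base_lr=1e-3, min_lr=1e-5))
```

Setting `smoothing` to 0 leaves the labels unchanged, which corresponds to turning the option off.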

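File download

The yolo4_weights.pth needed for training can be downloaded from Baidu Netdisk.
Link: https://pan.baidu.com/s/1WlDNPtGO1pwQbqwKx1gRZA Extraction code: p4sc
yolo4_weights.pth contains the weights trained on the COCO dataset.
yolo4_voc_weights.pth contains the weights trained on the VOC dataset.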

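Prediction steps

a. Using pretrained weights

After downloading the repository, unzip it, download yolo4_weights.pth or yolo4_voc_weights.pth from Baidu Netdisk, put it into model_data, run predict.py, and enter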

img/street.jpg

Use video.py for webcam detection.

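b. Using your own trained weights

Train the model by following the training steps.
In yolo.py, edit model_path and classes_path in the section below so that they match your trained files. model_path points to a weights file under the logs folder, and classes_path lists the classes that model_path was trained on.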

_defaults = {
    "model_path": 'model_data/yolo4_weights.pth',
    "anchors_path": 'model_data/yolo_anchors.txt',
    "classes_path": 'model_data/coco_classes.txt',
    "model_image_size" : (416, 416, 3),
    "confidence": 0.5,
    "cuda": True
}

Run predict.py and enter

img/street.jpg

Use video.py for webcam detection.

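Training steps

This project trains on data in VOC format.
Before training, put the label files in the Annotation folder under VOCdevkit/VOC2007.
Before training, put the image files in the JPEGImages folder under VOCdevkit/VOC2007.
Before training, run voc2yolo4.py to generate the corresponding txt files.
Then run voc_annotation.py in the root directory. Before running it, change classes to your own classes. Do not use Chinese labels, and make sure folder names contain no spaces!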

classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

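This generates the corresponding 2007_train.txt, in which each line holds an image path and the positions of its ground-truth boxes.
Before training, you must create a txt file under model_data listing the classes to detect, and point classes_path in train.py to that file, for example: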

classes_path = 'model_data/new_classes.txt'

The contents of model_data/new_classes.txt are:

cat
dog
...
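A classes file like the one above is simple to read and sanity-check before training. The following is a minimal sketch; `load_classes` and `check_class_names` are hypothetical helper names, not functions from this repository. The check mirrors the advice given earlier about avoiding Chinese labels and spaces.

```python
def load_classes(path):
    """Read one class name per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def check_class_names(names):
    """Return the names that are non-ASCII or contain a space."""
    return [name for name in names if not name.isascii() or " " in name]
```

Calling `check_class_names(load_classes('model_data/new_classes.txt'))` should return an empty list when every label is safe to use.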

Run train.py to start training.
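To inspect the generated 2007_train.txt before training, each line can be split into an image path and its boxes. The sketch below assumes the common layout `image_path x1,y1,x2,y2,class_id x1,y1,x2,y2,class_id ...`, with fields separated by spaces; this layout is an assumption and should be checked against the file that voc_annotation.py actually produces. Under that assumption, a space inside a folder name would break the split, which is consistent with the warning about spaces.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, [(x1, y1, x2, y2, class_id), ...]).

    Assumes space-separated fields and comma-separated integers per box.
    """
    parts = line.split()
    image_path = parts[0]
    boxes = [tuple(int(value) for value in part.split(",")) for part in parts[1:]]
    return image_path, boxes


# Example line in the assumed format (values are illustrative only).
example = "VOCdevkit/VOC2007/JPEGImages/000001.jpg 48,240,195,371,11 8,12,352,498,14"
print(parse_annotation_line(example))
```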

mAP detection accuracy calculation update

Updated get_gt_txt.py, get_dr_txt.py, and get_map.py.
get_map.py is cloned from https://github.com/Cartucho/mAP
For details of the mAP calculation, see https://www.bilibili.com/video/BV1zE411u7Vw
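For readers who want the core of the calculation without watching the video, the sketch below computes average precision for one class from a precision/recall curve using all-point interpolation, in the style of the PASCAL VOC 2010+ metric. It is an illustration written for this README, not a copy of get_map.py; mAP is then the mean of this value over all classes.

```python
def voc_ap(recall, precision):
    """Area under the precision/recall curve with all-point interpolation.

    `recall` and `precision` are lists ordered by descending detection
    confidence, so `recall` is non-decreasing.
    """
    # Add sentinel points at both ends of the curve.
    mrec = [0.0] + list(recall) + [1.0]
    mpre = [0.0] + list(precision) + [0.0]

    # Make precision monotonically non-increasing from right to left.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])

    # Sum rectangle areas wherever recall changes.
    ap = 0.0
    for i in range(1, len(mrec)):
        if mrec[i] != mrec[i - 1]:
            ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap
```

A detector that ranks every true positive ahead of every false positive gets an AP of 1.0 under this definition.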

Reference

https://github.com/qqwweee/keras-yolo3/
https://github.com/Cartucho/mAP
https://github.com/Ma-Dan/keras-yolo4

The above is the original readme.md.

My work

I trained YOLOv4 to detect helmets and their colors. To check whether it had learned well, I visualized the output of the YOLO head.

[Images: origin.jpg, detection.jpg]

[Heatmaps: head0, head1, head2]

The heatmaps above visualize the "yellow" class; the hot areas clearly focus on the yellow helmets. So I think this kind of visualization can help guide model training to a certain extent.
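One possible way to turn a YOLO head output into a per-class heatmap is sketched below. This is not the code used to produce the images in this repository, and it is not classic CAM (which weights feature maps by classifier weights). It assumes the usual YOLO head layout: `num_anchors * (5 + num_classes)` channels, ordered per anchor as x, y, w, h, objectness, then class scores. The function name `class_heatmap` and the choice to take the maximum over anchors are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def class_heatmap(head, num_classes, class_index, num_anchors=3):
    """Build a [0, 1] heatmap for one class from a head given as [channel][y][x]."""
    per_anchor = 5 + num_classes
    if len(head) != num_anchors * per_anchor:
        raise ValueError("unexpected number of channels")
    height, width = len(head[0]), len(head[0][0])

    heat = [[0.0] * width for _ in range(height)]
    for anchor in range(num_anchors):
        objectness = head[anchor * per_anchor + 4]
        class_score = head[anchor * per_anchor + 5 + class_index]
        for y in range(height):
            for x in range(width):
                score = sigmoid(objectness[y][x]) * sigmoid(class_score[y][x])
                heat[y][x] = max(heat[y][x], score)  # strongest anchor wins

    # Min-max normalize so the map can be rendered as a color overlay.
    low = min(min(row) for row in heat)
    high = max(max(row) for row in heat)
    span = (high - low) or 1.0
    return [[(value - low) / span for value in row] for row in heat]
```

The resulting grid has the resolution of the head (for example 13x13, 26x26, or 52x52 for a 416x416 input), so it would need to be upsampled to the image size before being overlaid on the original picture.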
