Custom FacePose implementation based on YOLOv5: YOLO structure interpretation, YOLO data format conversion, and YOLO pipeline modification
2022-08-05 03:26:00 【Burnt Bay】
Introduction: This post records how to implement a custom dataset and detection task on top of YOLOv5. Starting from the original project's data format, it pays attention to every detail and then redoes the custom task in the same format, so that a project can be migrated independently to a new task.
wandb: visualizing the training process
tensorboard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, kpt=0.1, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
loggers['wandb'] = wandb_logger.wandb # train.py logs to Weights & Biases (wandb) for visualization; an account needs to be created
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 1
wandb: You chose 'Create a W&B account'
wandb: Create an account here: https://wandb.ai/authorize?signup=true
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:
You need to register on the wandb website; signing up with a GitHub account is sufficient. Then obtain an API key.
Model parsing
This section covers the anchor settings and the output of the detection head.
def parse_model(d, ch): # model_dict, input_channels(3)
    logger.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
    anchors, nc, nkpt, gd, gw = d['anchors'], d['nc'], d['nkpt'], d['depth_multiple'], d['width_multiple']
    # Anchor definitions; here anchors: [[19, 27, 44, 40, 38, 94], [96, 68, 86, 152, 180, 137], [140, 301, 303, 264, 238, 542], [436, 615, 739, 380, 925, 792]]
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors # number of anchors per level, na = 3
    # Keypoint extension from the paper: 3×(1+5+2×17)=3×40
    no = na * (nc + 5 + 2*nkpt) # number of outputs = anchors * (classes + 5 + 2 * keypoints)
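As a quick check of the arithmetic in the comment above, the following minimal sketch recomputes `na` and `no` from the anchors list. The helper name `head_outputs` is hypothetical and not part of the project; only the formula is taken from `parse_model`, and `nc=1` is an assumption for a single-class task.

```python
# Hypothetical helper: recompute the detection-head width from the model config.
anchors = [
    [19, 27, 44, 40, 38, 94],
    [96, 68, 86, 152, 180, 137],
    [140, 301, 303, 264, 238, 542],
    [436, 615, 739, 380, 925, 792],
]


def head_outputs(anchors, nc, nkpt):
    """Return (na, no) following the formula used in parse_model."""
    # Each anchor is a (width, height) pair, so 6 numbers per level means 3 anchors.
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors
    no = na * (nc + 5 + 2 * nkpt)
    return na, no


print(head_outputs(anchors, nc=1, nkpt=17))  # COCO pose: 17 keypoints
print(head_outputs(anchors, nc=1, nkpt=68))  # 300-W faces: 68 keypoints
```

Going from 17 to 68 keypoints therefore widens each detection output considerably, which is why `nkpt` has to be changed consistently in the model config.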
Relationship between optimizer parameters and batch size
# Optimizer
nbs = 64 # nominal batch size
accumulate = max(round(nbs / total_batch_size), 1) # accumulate loss before optimizing
# The batch size does not need to be changed here; weight_decay is rescaled instead, and the loss is accumulated over several batches before each optimizer step
hyp['weight_decay'] *= total_batch_size * accumulate / nbs # scale weight_decay
logger.info(f"Scaled weight_decay = {hyp['weight_decay']}")
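To see how the two formulas above interact, here is a small sketch that evaluates them for a few batch sizes. The function name `scaled_weight_decay` is hypothetical; the default `weight_decay=0.0005` is the value from the hyperparameter log shown earlier.

```python
def scaled_weight_decay(total_batch_size, weight_decay=0.0005, nbs=64):
    """Return (accumulate, scaled weight_decay) as computed in the snippet above."""
    accumulate = max(round(nbs / total_batch_size), 1)
    scaled = weight_decay * total_batch_size * accumulate / nbs
    return accumulate, scaled


for bs in (8, 16, 48, 64, 128):
    acc, wd = scaled_weight_decay(bs)
    print(f"batch={bs:3d} accumulate={acc} effective_batch={bs * acc:3d} weight_decay={wd:.6f}")
```

When the batch size divides 64 evenly, the effective batch stays at 64 and the weight decay is unchanged; otherwise the decay is scaled in proportion to the effective batch size.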
Image augmentation
# class LoadImagesAndLabels(Dataset): # for training/testing
...
# Mosaic augmentation
self.mosaic = self.augment and not self.rect # load 4 images at a time into a mosaic (only during training)
self.mosaic_border = [-img_size // 2, -img_size // 2]
self.stride = stride
self.path = path
self.kpt_label = kpt_label
# Keypoint-specific addition: index mapping that swaps left/right keypoints when an image is flipped horizontally
self.flip_index = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
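The sketch below illustrates how such an index mapping can be applied during a horizontal flip. It is not the project's code: the function name `flip_keypoints_lr` is hypothetical, and it assumes keypoints are normalized to [0, 1] with (0, 0) marking an unlabeled point.

```python
import numpy as np

# Index mapping from the dataset class above (COCO order: nose, then left/right pairs).
FLIP_INDEX = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]


def flip_keypoints_lr(kpts, flip_index=FLIP_INDEX):
    """Mirror normalized keypoints of shape (num_kpts, 2) around the vertical axis."""
    kpts = np.asarray(kpts, dtype=np.float64).copy()
    labeled = ~np.all(kpts == 0, axis=1)       # (0, 0) marks an unlabeled keypoint
    kpts[labeled, 0] = 1.0 - kpts[labeled, 0]  # mirror x; y is unchanged
    return kpts[flip_index]                    # swap left/right semantic slots


kpts = np.zeros((17, 2))
kpts[1] = [0.30, 0.40]  # left eye
kpts[2] = [0.20, 0.40]  # right eye
flipped = flip_keypoints_lr(kpts)
print(flipped[1], flipped[2])
```

After the flip, the point that was the right eye becomes the left eye and vice versa. A 68-point face dataset needs its own 68-entry mapping, which is one of the dataset changes listed later.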
Converting between COCO and YOLO formats
Original COCO format
${POSE_ROOT}
|-- data
`-- |-- coco
`-- |-- annotations
| |-- person_keypoints_train2017.json
| `-- person_keypoints_val2017.json
|-- person_detection_results
| |-- COCO_val2017_detections_AP_H_56_person.json
`-- images
|-- train2017
| |-- 000000000009.jpg
| |-- 000000000025.jpg
| |-- 000000000030.jpg
| |-- ...
`-- val2017
|-- 000000000139.jpg
|-- 000000000285.jpg
|-- 000000000632.jpg
|-- ...
In other words, the keypoint labels are stored in the JSON files. We can take one sample and analyze its JSON data.
The image entry in the JSON contains the file name, width and height, id, and other information:
{
"license": 4,
"file_name": "000000252219.jpg",
"coco_url": "http://images.cocodataset.org/val2017/000000252219.jpg",
"height": 428,"width": 640,
"date_captured": "2013-11-14 22:32:02",
"flickr_url": "http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg",
"id": 252219
}
The image (000000252219.jpg, not reproduced here) has the following manual annotation:
{
"num_keypoints": 17,
"area": 8511.1568, "iscrowd": 0,
"keypoints": [356,198,2,358,193,2,351,194,2,364,192,2,346,194,2,375,207,2,341,211,2,388,236,2,336,238,2,392,263,2,
343,242,2,373,271,2,347,272,2,372,316,2,348,318,2,372,353,2,355,354,2],
"image_id": 252219,
"bbox": [326.28,174.56,71.24,197.25],
"category_id": 1,"id": 481918
}
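As we can see, the keypoint annotation in COCO format is a flat list of 3×17 values, one triplet [x, y, v] per keypoint; num_keypoints counts only the labeled keypoints (v>0). Here v is the visibility flag:
- v=0: not labeled, in which case x=y=0;
- v=1: labeled but not visible;
- v=2: labeled and visible.
A second annotation from the same image, with two unlabeled keypoints: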
{
"num_keypoints": 15,
"area": 8349.28485,"iscrowd": 0,
"keypoints": [100,190,2,0,0,0,96,185,2,0,0,0,86,188,2,84,208,2,71,208,2,84,245,2,59,240,2,115,263,2,66,271,2,
64,268,2,71,264,2,59,324,2,99,322,2,18,363,2,101,377,2],
"image_id": 252219,
"bbox": [9.79,167.06,121.94,226.45],
"category_id": 1,
"id": 489768
}
The bounding box uses the **"xywh"** format, i.e. top-left corner coordinates plus width and height.
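To make the triplet layout concrete, the following sketch splits the flat keypoint list of the second annotation above into (x, y, v) triplets and counts the labeled ones. The helper name `split_keypoints` is hypothetical; the data is copied from the annotation.

```python
# The second annotation shown above (two keypoints are unlabeled).
annotation = {
    "num_keypoints": 15,
    "keypoints": [100, 190, 2, 0, 0, 0, 96, 185, 2, 0, 0, 0, 86, 188, 2, 84, 208, 2,
                  71, 208, 2, 84, 245, 2, 59, 240, 2, 115, 263, 2, 66, 271, 2,
                  64, 268, 2, 71, 264, 2, 59, 324, 2, 99, 322, 2, 18, 363, 2, 101, 377, 2],
    "bbox": [9.79, 167.06, 121.94, 226.45],
}


def split_keypoints(flat):
    """Group the flat COCO keypoint list into (x, y, v) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


triplets = split_keypoints(annotation["keypoints"])
labeled = [t for t in triplets if t[2] > 0]

# xywh box: top-left corner plus width and height.
x, y, w, h = annotation["bbox"]
print(len(triplets), len(labeled))
print("top-left:", (x, y), "bottom-right:", (x + w, y + h))
```

The list always holds 17 triplets, while the number of labeled triplets matches the `num_keypoints` field.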
YOLO format
${POSE_ROOT}
|-- data
`-- |-- coco
`-- |-- annotations
| |-- person_keypoints_train2017.json
| `-- person_keypoints_val2017.json
|-- person_detection_results
| |-- COCO_val2017_detections_AP_H_56_person.json
`-- images
| |-- train2017
| | |-- 000000000009.jpg
| | |-- 000000000025.jpg
| | |-- ...
| `-- val2017
| |-- 000000000139.jpg
| |-- 000000000285.jpg
| |-- ...
`-- labels
| |-- train2017
| | |-- 000000000009.txt
| | |-- 000000000025.txt # keypoint information for this image, in YOLO format
| | |-- ...
| `-- val2017
| |-- 000000000139.txt
| |-- 000000000285.txt # keypoint information for this image, in YOLO format
| |-- ...
`-- train2017.txt # contents: relative path + image file name
`-- val2017.txt # contents: relative path + image file name
Below is the YOLO-format label information for "image_id": 252219:
0 0.565469 0.638283 0.111312 0.460864 0.556250 0.462617 2.000000 0.559375 0.450935 2.000000 0.548438 0.453271 2.000000 0.568750
0.448598 2.000000 0.540625 0.453271 2.000000 0.585938 0.483645 2.000000 0.532813 0.492991 2.000000 0.606250 0.551402 2.000000
0.525000 0.556075 2.000000 0.612500 0.614486 2.000000 0.535937 0.565421 2.000000 0.582812 0.633178 2.000000 0.542188 0.635514
2.000000 0.581250 0.738318 2.000000 0.543750 0.742991 2.000000 0.581250 0.824766 2.000000 0.554688 0.827103 2.000000
0 0.110562 0.654871 0.190531 0.529089 0.156250 0.443925 2.000000 0.000000 0.000000 0.000000 0.150000 0.432243 2.000000 0.000000
0.000000 0.000000 0.134375 0.439252 2.000000 0.131250 0.485981 2.000000 0.110937 0.485981 2.000000 0.131250 0.572430 2.000000
0.092188 0.560748 2.000000 0.179688 0.614486 2.000000 0.103125 0.633178 2.000000 0.100000 0.626168 2.000000 0.110937 0.616822
2.000000 0.092188 0.757009 2.000000 0.154688 0.752336 2.000000 0.028125 0.848131 2.000000 0.157812 0.880841 2.000000
0 0.894172 0.652220 0.193219 0.504112 0.837500 0.448598 1.000000 0.840625 0.439252 2.000000 0.000000 0.000000 0.000000 0.862500
0.443925 2.000000 0.000000 0.000000 0.000000 0.887500 0.483645 2.000000 0.867188 0.485981 2.000000 0.873437 0.567757 2.000000
0.865625 0.574766 2.000000 0.846875 0.630841 2.000000 0.859375 0.647196 2.000000 0.895312 0.640187 2.000000 0.873437 0.640187
2.000000 0.920312 0.754673 2.000000 0.845313 0.752336 2.000000 0.964063 0.852804 2.000000 0.828125 0.843458 2.000000
The format conversion follows the JSON2YOLO conversion script; its core algorithm is:
img = images['%g' % x['image_id']]
h, w, f = img['height'], img['width'], img['file_name']
# The COCO box format is [top left x, top left y, width, height]
box = np.array(x['bbox'], dtype=np.float64)
box[:2] += box[2:] / 2 # xy top-left corner to center
box[[0, 2]] /= w # normalize x
box[[1, 3]] /= h # normalize y
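This shows that the YOLO format uses normalized center-point coordinates: the COCO XYWH box (top-left corner) must be converted to $C_xC_yWH$ (note that all values are then normalized by the image width and height). Using the original COCO annotation above, let us check whether we obtain the YOLO format: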
"height": 428,"width": 640,
"num_keypoints": 17,
"area": 8511.1568, "iscrowd": 0,
"keypoints": [356,198,2,358,193,2,351,194,2,364,192,2,346,194,2,375,207,2,341,211,2,388,236,2,336,238,2,392,263,2,
343,242,2,373,271,2,347,272,2,372,316,2,348,318,2,372,353,2,355,354,2],
"image_id": 252219,
"bbox": [326.28,174.56,71.24,197.25],
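Applying the algorithm above with the inputs rounded to integers gives a rough estimate:
bbox: (326+71/2)/640≈0.5648, (174+197/2)/428≈0.6367, 71/640≈0.1109, 197/428≈0.4603
keypoints[0]: 356/640=0.55625, 198/428≈0.4626
These values agree with the converted YOLO-format result; the small differences in the box center come from rounding the inputs: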
0 0.565469 0.638283 0.111312 0.460864 0.556250 0.462617 2.000000
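The same check can be done without rounding. The sketch below applies the box conversion from the JSON2YOLO fragment above and additionally normalizes the keypoints; the function names `coco_to_yolo` and `format_label`, and the assumption that the class index is 0, are illustrative and not taken from the original script.

```python
import numpy as np


def coco_to_yolo(bbox, keypoints, img_w, img_h):
    """Convert a COCO xywh box and flat keypoint list to normalized YOLO values."""
    box = np.array(bbox, dtype=np.float64)
    box[:2] += box[2:] / 2   # top-left corner -> box center
    box[[0, 2]] /= img_w     # normalize center x and width
    box[[1, 3]] /= img_h     # normalize center y and height

    kpts = np.array(keypoints, dtype=np.float64).reshape(-1, 3)
    kpts[:, 0] /= img_w      # normalize keypoint x
    kpts[:, 1] /= img_h      # normalize keypoint y; visibility is left unchanged
    return box, kpts


def format_label(cls, box, kpts):
    """Join class index, box and keypoints into one space-separated label line."""
    fields = [str(cls)] + [f"{v:.6f}" for v in (*box, *kpts.reshape(-1))]
    return " ".join(fields)


keypoints = [356, 198, 2, 358, 193, 2, 351, 194, 2, 364, 192, 2, 346, 194, 2, 375, 207, 2,
             341, 211, 2, 388, 236, 2, 336, 238, 2, 392, 263, 2, 343, 242, 2, 373, 271, 2,
             347, 272, 2, 372, 316, 2, 348, 318, 2, 372, 353, 2, 355, 354, 2]
box, kpts = coco_to_yolo([326.28, 174.56, 71.24, 197.25], keypoints, img_w=640, img_h=428)
line = format_label(0, box, kpts)
print(line)
```

The resulting line has 56 fields: one class index, four box values, and 17 keypoint triplets.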
Converting 300-W to YOLO format
300-W is a popular face database annotated with 68 facial keypoints; its faces come from several datasets such as afw and ibug. Its directory layout is as follows:
-- data
|-- data_300W
|-- afw
|-- helen
|-- ibug
|-- lfpw
The converted, COCO-like target layout is:
|-- data
`-- |-- data_300W
`-- |-- annotations
|-- afw
|-- helen
|-- ibug
|-- lfpw
`-- images
| |-- train2017
| | |-- 000000000009.jpg
| | |-- 000000000025.jpg
| | |-- ...
| `-- val2017
| |-- 000000000139.jpg
| |-- 000000000285.jpg
| |-- ...
`-- labels
| |-- train2017
| | |-- 000000000009.txt
| | |-- 000000000025.txt # keypoint information for this image, in YOLO format
| | |-- ...
| `-- val2017
| |-- 000000000139.txt
| |-- 000000000285.txt # keypoint information for this image, in YOLO format
| |-- ...
`-- train2017.txt # contents: relative path + image file name
`-- val2017.txt # contents: relative path + image file name
300-W format
Take data_300W/afw/1051618982_1.jpg as an example.
The 68 facial landmarks for this image are stored in a *.pts file; its contents are:
version: 1
n_points: 68
{
482.866335 268.009351
484.241455 298.524244
487.963820 329.985842
491.613829 359.446370
503.992490 387.443021
523.666182 409.551102
543.708366 429.090358
566.283098 442.751692
……
591.348649 385.406662
580.068281 384.385348
563.609110 379.281936
552.917511 366.852392
580.508062 371.198816
592.309498 371.492218
604.011866 371.855814
634.952400 369.536292
604.011866 371.855814
592.309498 371.492218
580.508062 371.198816
}
There are 68 coordinate pairs $(x_i, y_i)$ in total; some pairs in the middle are omitted for brevity. The coco2yolo label format is as follows:
0 xywh (x, y)
| | |
| | `-- keypoint coordinates, normalized by the image width and height
| `-- normalized bounding box: center coordinates xy and box width and height wh
`-- class index; 0 for the only class (person in COCO keypoints, face here)
Converting the 300-W format to YOLO format
In other words, the 68 facial keypoints above need to be converted into the coco2yolo format. Here we follow PIPNet's preprocessing and convert the 300W folders entirely into a COCO-like file layout, including the target label format. This keeps the code changes needed in YOLO to a minimum.
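A minimal sketch of such a conversion is shown below. It is not the PIPNet preprocessing itself: the function names `parse_pts` and `landmarks_to_yolo` are hypothetical, the bounding box is assumed to be the tight box around the landmarks, and the image size (1000×1000) is a made-up value for illustration. The sample uses the first three points listed above.

```python
def parse_pts(text):
    """Parse a 300-W style landmark file into a list of (x, y) floats."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    start = lines.index("{") + 1   # coordinates start after the opening brace
    end = lines.index("}")         # and stop at the closing brace
    return [tuple(float(v) for v in ln.split()) for ln in lines[start:end] if ln]


def landmarks_to_yolo(points, img_w, img_h, cls=0):
    """Build one label line: class, normalized center box, normalized keypoints."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    # Assumption: the box is the tight bounding box of the landmarks.
    fields = [
        (x_min + x_max) / 2 / img_w,
        (y_min + y_max) / 2 / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    ]
    for x, y in points:
        fields += [x / img_w, y / img_h]
    return f"{cls} " + " ".join(f"{v:.6f}" for v in fields)


sample = """version: 1
n_points: 3
{
482.866335 268.009351
484.241455 298.524244
487.963820 329.985842
}
"""
points = parse_pts(sample)
label = landmarks_to_yolo(points, img_w=1000, img_h=1000)
print(label)
```

For a real file, the image width and height would be read from the image itself, and the line would be written to the matching `labels/.../*.txt` file.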
At this point, the format conversion is complete.
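Project modifications (pitfall notes)
Quite a few changes to the YOLO code are involved, mainly in two areas:
- dataset loading;
- detection head modification.
Modify the relevant settings in the launch file;
Modify the data paths in data/coco_kepts.yaml;
Modify the cfg file under models/hub, e.g. in yolo5s6_kpts.yaml change nkpt from 17 to 68;
Modify line 497 of dataset, which controls how the txt label data is read;
Modify line 987 of dataset, which controls how the data is transformed;
Modify line 365 of dataset, which controls how the data is flipped;
Modify lines 187 and 202 of the loss function, concerning loss_gain;
Line 119 of the loss function: the sigmas are hard-coded, so simply set them all to 1;
Lines 76 and 84 of the plots function: plotting issues, not fixed yet, so plotting is skipped for now;
Modify line 90 of the yolo function, concerning self.inplace.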
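For context on the sigmas item, the sketch below computes a COCO-style object keypoint similarity (OKS), in which each keypoint has its own sigma constant. This is an illustration of the general formula, not the project's loss code; the function name `oks`, the test values, and the assumption that the loss uses sigmas in this way are all hypothetical.

```python
import numpy as np


def oks(pred, gt, visible, area, sigmas):
    """COCO-style object keypoint similarity for one instance.

    pred, gt: (K, 2) pixel coordinates; visible: (K,) mask of labeled points;
    area: object area in pixels^2; sigmas: (K,) per-keypoint constants.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    d2 = np.sum((pred - gt) ** 2, axis=1)                         # squared distances
    variances = (2.0 * np.asarray(sigmas, dtype=np.float64)) ** 2  # (2 * sigma_i)^2
    e = d2 / (2.0 * variances * (area + np.spacing(1)))
    visible = np.asarray(visible, dtype=bool)
    return float(np.exp(-e)[visible].mean()) if visible.any() else 0.0


K = 68                                   # number of facial keypoints
gt = np.zeros((K, 2))
pred = gt + np.array([3.0, 0.0])         # every prediction is off by 3 pixels in x
vis = np.ones(K, dtype=bool)
area = 100.0 * 100.0                     # a 100x100 pixel face box

s_small = oks(pred, gt, vis, area, np.full(K, 0.05))
s_large = oks(pred, gt, vis, area, np.full(K, 1.0))
print(s_small, s_large)
```

Larger sigmas make the score more tolerant of the same pixel error, so a uniform value is a simplification; whichever value is chosen, the sigma array must have 68 entries once `nkpt` is 68.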
train log
autoanchor: Analyzing anchors... anchors/target = 7.86, Best Possible Recall (BPR) = 1.0000
Image sizes 640 train, 640 test
Using 4 dataloader workers
Logging results to runs/train/exp10
Starting training for 300 epochs...
Epoch gpu_mem box obj cls kpt kptv total labels img_size
0/299 4.22G 0.07731 0.0573 0 0.3465 0.01299 0.4941 10 640: 100%| 787/787 [02:58<00:00, 4.41it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%| 87/87 [00:14<00:00, 6.05it/s]
all 689 689 0.0073 0.691 0.00784 0.00137
……
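One epoch takes about 3 minutes (2:58 for training plus 0:14 for validation in the log above); with 300 epochs, the full run takes roughly 15 to 16 hours. Looking forward to the results!
To be continued
The project finally runs after debugging! More details will be released gradually.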