[Object detection] Small script: extract training-set images and labels and update the index
2022-08-10 13:21:00 【zstar-_】
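Problem scenario
When working on an object detection task, I wanted to pull out the training-set images on their own so I could run external data augmentation on them. That means extracting the training-set images and labels according to the split listed in train.txt.
Implementation
I tested this on the VOC dataset; the implementation is fairly simple.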
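import os
import shutil

if __name__ == '__main__':
    img_src = r"D:\Dataset\VOC2007\images"
    xml_src = r"D:\Dataset\VOC2007\Annotations"
    img_out = "image_out/"
    xml_out = "xml_out/"
    txt_path = r"D:\Dataset\VOC2007\ImageSets\Segmentation\train.txt"
    # Create the output folders first; shutil.copy does not create them
    os.makedirs(img_out, exist_ok=True)
    os.makedirs(xml_out, exist_ok=True)
    # Read the txt file (one image ID per line)
    with open(txt_path, 'r') as f:
        line_list = f.readlines()
    for line in line_list:
        line_new = line.strip()  # drop the trailing newline
        if not line_new:
            continue  # skip blank lines
        shutil.copy(os.path.join(img_src, line_new + ".jpg"), img_out)
        shutil.copy(os.path.join(xml_src, line_new + ".xml"), xml_out)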
Result: the training-set images and their XML labels are copied into image_out/ and xml_out/.
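The extraction step above can also be wrapped in a reusable function. The following is only a sketch under a few assumptions: the name `copy_split` and its parameters are hypothetical, every image is assumed to share one extension, and an ID whose image or XML file is missing is skipped and reported rather than raising an error.

```python
import shutil
from pathlib import Path


def copy_split(txt_path, img_src, xml_src, img_out, xml_out, img_ext=".jpg"):
    """Copy every image/label pair listed in txt_path; return the IDs that were skipped."""
    img_out, xml_out = Path(img_out), Path(xml_out)
    img_out.mkdir(parents=True, exist_ok=True)
    xml_out.mkdir(parents=True, exist_ok=True)

    missing = []
    # split() handles blank lines and trailing whitespace in the split file
    for image_id in Path(txt_path).read_text().split():
        img = Path(img_src) / (image_id + img_ext)
        xml = Path(xml_src) / (image_id + ".xml")
        if not (img.is_file() and xml.is_file()):
            missing.append(image_id)  # skip incomplete pairs instead of raising
            continue
        shutil.copy(img, img_out)
        shutil.copy(xml, xml_out)
    return missing
```

Calling it with the same five paths used in the script and printing the returned list is a quick way to check whether train.txt references files that are not on disk.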
Updating the training-set index
After data augmentation, drop the generated images and labels into the VOC folders, mixed in with the originals.
Then write a script that appends the names of the generated images to the train.txt file.
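import os

if __name__ == '__main__':
    xml_src = r"C:\Users\xy\Desktop\read_train\xml_out_af"
    txt_path = r"D:\Dataset\VOC2007\ImageSets\Segmentation\train.txt"
    # Open train.txt once in append mode instead of reopening it for every file
    with open(txt_path, 'a') as f:
        for name in os.listdir(xml_src):
            stem, ext = os.path.splitext(name)
            if ext.lower() == ".xml":  # ignore anything that is not a label file
                f.write(stem + "\n")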
Result: the names of the augmented samples are appended to the end of train.txt.
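One caveat with plain appending is that running the script twice lists every augmented sample twice. A minimal sketch of an idempotent variant follows; the function name `append_new_ids` is hypothetical, and it assumes train.txt already exists and holds one ID per line.

```python
import os


def append_new_ids(xml_dir, txt_path):
    """Append the stem of every .xml file in xml_dir to txt_path, skipping IDs already listed."""
    with open(txt_path, "r") as f:
        content = f.read()
    existing = set(content.split())

    new_ids = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(xml_dir)
        if name.lower().endswith(".xml")
        and os.path.splitext(name)[0] not in existing
    )
    with open(txt_path, "a") as f:
        if content and not content.endswith("\n"):
            f.write("\n")  # avoid gluing the first new ID onto the last old line
        for image_id in new_ids:
            f.write(image_id + "\n")
    return new_ids
```

The function returns the IDs it actually added, so a second run on the same folder returns an empty list and leaves train.txt unchanged.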
Finally, rerun the xml2txt script from my earlier blog post on VOC:
import os
import xml.etree.ElementTree as ET

sets = ['train', 'test', 'val']
Imgpath = r'D:\Dataset\VOC2007\images'  # image folder
xmlfilepath = r'D:\Dataset\VOC2007\Annotations'  # folder holding the xml files
ImageSets_path = r'D:\Dataset\VOC2007\ImageSets\Segmentation'
Label_path = r'D:\Dataset\VOC2007'
classes = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog',
           'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']


def convert(size, box):
    # size = (width, height); box = (xmin, xmax, ymin, ymax), all in pixels
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0  # box centre
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    # normalise by the image size
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def convert_annotation(image_id):
    tree = ET.parse(xmlfilepath + '/%s.xml' % (image_id))
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    with open(Label_path + '/labels/%s.txt' % (image_id), 'w') as out_file:
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            # skip unknown classes and objects marked as difficult
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
                 float(xmlbox.find('ymax').text))
            bb = convert((w, h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


# The path separator was missing here, so the folder that was created
# did not match the one convert_annotation writes to.
os.makedirs(Label_path + '/labels/', exist_ok=True)
for image_set in sets:
    with open(ImageSets_path + '/%s.txt' % (image_set)) as f:
        image_ids = f.read().strip().split()
    with open(Label_path + '/%s.txt' % (image_set), 'w') as list_file:
        for image_id in image_ids:
            # print(image_id)  # e.g. DJI_0013_00360
            list_file.write(Imgpath + '/%s.jpg\n' % (image_id))
            convert_annotation(image_id)
After running it, you can see that the augmented samples have been added cleanly to the original dataset.
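To make the box conversion in the script concrete, here is a small worked example. It is a sketch, not the script's own code: `voc_box_to_yolo` and `yolo_to_voc_box` are hypothetical names, the box keeps the script's (xmin, xmax, ymin, ymax) ordering, and it divides by the image size directly instead of multiplying by a reciprocal, which is mathematically the same.

```python
def voc_box_to_yolo(size, box):
    """size = (width, height); box = (xmin, xmax, ymin, ymax) in pixels.

    Returns (x_center, y_center, width, height), each normalised to [0, 1].
    """
    img_w, img_h = size
    xmin, xmax, ymin, ymax = box
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    return (x_center, y_center, (xmax - xmin) / img_w, (ymax - ymin) / img_h)


def yolo_to_voc_box(size, yolo_box):
    """Inverse mapping, handy for checking that a label file round-trips."""
    img_w, img_h = size
    xc, yc, w, h = yolo_box
    return ((xc - w / 2) * img_w, (xc + w / 2) * img_w,
            (yc - h / 2) * img_h, (yc + h / 2) * img_h)


# A 400x200 image with a box spanning x in [100, 200] and y in [50, 100]
print(voc_box_to_yolo((400, 200), (100, 200, 50, 100)))
# (0.375, 0.375, 0.25, 0.25)
```

The box centre is at pixel (150, 75), so dividing by the 400x200 image size gives 0.375 for both centre coordinates, and the 100x50 pixel box becomes 0.25 in both width and height.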