Yolact-keras: a Keras implementation of the YOLACT instance segmentation model


Contents

  1. Performance
  2. Environment
  3. Download
  4. How2train
  5. How2predict
  6. How2eval
  7. Reference

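Performance

| Training dataset | Weights file | Test dataset | Input size | bbox mAP 0.5:0.95 | bbox mAP 0.5 | segm mAP 0.5:0.95 | segm mAP 0.5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| COCO-Train2017 | yolact_weights_coco.h5 | COCO-Val2017 | 544x544 | 30.3 | 51.8 | 27.1 | 47.2 |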
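
For readers unfamiliar with the COCO metric names in the table: "mAP 0.5:0.95" is the AP averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, while "mAP 0.5" uses the single threshold 0.50. The sketch below only illustrates that averaging. The function names are hypothetical and are not part of this repository, which relies on its own eval.py.

```python
def coco_iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 used by the COCO 0.5:0.95 metric."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def average_over_thresholds(ap_per_threshold):
    """Average one AP value per IoU threshold into a single 0.5:0.95 score."""
    thresholds = coco_iou_thresholds()
    if len(ap_per_threshold) != len(thresholds):
        raise ValueError("expected one AP value per IoU threshold")
    return sum(ap_per_threshold) / len(thresholds)
```

Because the stricter thresholds are included in the average, the 0.5:0.95 columns are always at or below the corresponding 0.5 columns.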

Environment

keras==2.1.5
tensorflow-gpu==1.13.2
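
Since both versions are pinned exactly, it can help to check the environment before training. The following is a minimal sketch, not part of the repository: `REQUIRED` and `check_versions` are hypothetical names, and the version strings passed in are assumed to come from the `__version__` attribute of each module (the `tensorflow-gpu` package is imported as `tensorflow`).

```python
# Pinned versions from the Environment section; keys are importable module names.
REQUIRED = {"keras": "2.1.5", "tensorflow": "1.13.2"}


def check_versions(installed, required=REQUIRED):
    """Compare installed versions against the pinned ones.

    `installed` maps a module name to its version string.
    Returns a list of human-readable problems (empty when everything matches).
    """
    problems = []
    for name, wanted in sorted(required.items()):
        found = installed.get(name)
        if found is None:
            problems.append("{} is missing (need {})".format(name, wanted))
        elif found != wanted:
            problems.append("{} is {} (need {})".format(name, found, wanted))
    return problems
```

A typical call would be `check_versions({"keras": keras.__version__, "tensorflow": tensorflow.__version__})` after importing both modules.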

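Download

The pretrained weights needed for training can be downloaded from Baidu Netdisk.
Link: https://pan.baidu.com/s/1OIxe9w2t5nImstDEpjncnQ
Extraction code: eik3

The shapes dataset can be downloaded from the link below. It contains raw labelme annotations with no further processing and is used to distinguish triangles from squares:
Link: https://pan.baidu.com/s/1hrCaEYbnSGBOhjoiOKQmig
Extraction code: jk44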

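How2train

a. Training on the shapes dataset

  1. Prepare the dataset
    Download the dataset from Baidu Netdisk as described in the Download section, extract it, and put the images and their json files into the datasets/before folder under the repository root.

  2. Process the dataset
    Open coco_annotation.py. Its default parameters are set up for the shapes dataset, so running it as-is generates the image and label files in the datasets/coco folder and splits the data into training and test sets.

  3. Start training
    The default parameters in train.py are set up for the shapes dataset and point to the dataset folder under the repository root, so running train.py starts training.

  4. Predict with the trained weights
    Prediction uses two files, yolact.py and predict.py. First edit model_path and classes_path in yolact.py; both must be changed.
    model_path points to the trained weights file in the logs folder.
    classes_path points to the txt file listing the detection classes.

    After that, run predict.py and enter an image path when prompted to run detection.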
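
Step 1 above expects every image in datasets/before to sit next to its labelme json file. A quick check for unpaired files before running coco_annotation.py could look like the sketch below. `find_unpaired` and the list of image extensions are assumptions made for illustration; the repository does not ship this helper.

```python
import os

# Assumed image types; adjust to whatever the dataset actually contains.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unpaired(folder):
    """Return (images_without_json, jsons_without_image) as sorted lists of file stems."""
    image_stems, json_stems = set(), set()
    for name in os.listdir(folder):
        stem, ext = os.path.splitext(name)
        ext = ext.lower()
        if ext in IMAGE_EXTENSIONS:
            image_stems.add(stem)
        elif ext == ".json":
            json_stems.add(stem)
    return sorted(image_stems - json_stems), sorted(json_stems - image_stems)
```

Calling `find_unpaired("datasets/before")` should return two empty lists when the folder is consistent.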

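b. Training on your own dataset

  1. Prepare the dataset
    This project uses labelme for annotation. Annotation produces image files and json files; put both in the before folder. See the shapes dataset for the exact format.
    When labelling, use an underscore (_) to separate the class name from an instance suffix so that different objects of the same class get different labels.
    For example, to train the network to detect triangles and squares, an image containing two triangles would have them labelled as:
triangle_1
triangle_2
  2. Process the dataset
    Edit the parameters in coco_annotation.py. For a first training run, changing only classes_path is enough; classes_path points to the txt file listing the detection classes.
    For your own dataset, create a cls_classes.txt that lists the classes you want to distinguish.
    The contents of model_data/cls_classes.txt look like:
cat
dog
...

    Set classes_path in coco_annotation.py to this cls_classes.txt, then run coco_annotation.py.

  3. Start training
    There are many training parameters, all in train.py; read the comments there after downloading the repository. The most important one is again classes_path.
    classes_path points to the txt file listing the detection classes, the same txt used in coco_annotation.py. It must be changed when training on your own dataset.
    After changing classes_path, run train.py to start training. After several epochs, weights are saved in the logs folder.

  4. Predict with the trained weights
    Prediction uses two files, yolact.py and predict.py. First edit model_path and classes_path in yolact.py; both must be changed.
    model_path points to the trained weights file in the logs folder.
    classes_path points to the txt file listing the detection classes.

    After that, run predict.py and enter an image path when prompted to run detection.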
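
The labelling convention in step 1 and the classes file in step 2 have to agree: every label such as triangle_1 should map back to a class listed in the txt file. The sketch below shows one way to check this. It is only an assumption about how the instance suffix might be parsed; coco_annotation.py may do it differently, and all three function names are hypothetical.

```python
def base_class_name(label):
    """Strip a trailing "_<number>" instance suffix, e.g. "triangle_1" -> "triangle"."""
    head, sep, tail = label.rpartition("_")
    if sep and head and tail.isdigit():
        return head
    return label


def read_classes(classes_path):
    """Read one class name per line from the classes txt, skipping blank lines."""
    with open(classes_path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def unknown_labels(labels, class_names):
    """Return the labels whose base class is not listed in the classes file."""
    known = set(class_names)
    return sorted({label for label in labels if base_class_name(label) not in known})
```

Splitting on the last underscore only when the suffix is numeric keeps class names that contain underscores themselves intact.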

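c. Training on the COCO dataset

  1. Prepare the dataset
    COCO training set: http://images.cocodataset.org/zips/train2017.zip
    COCO validation set: http://images.cocodataset.org/zips/val2017.zip
    COCO training and validation annotations: http://images.cocodataset.org/annotations/annotations_trainval2017.zip

  2. Start training
    After extracting the training set, validation set and annotations, open train.py and set classes_path to model_data/coco_classes.txt.
    Set train_image_path to the training image folder, train_annotation_path to the training annotation file, val_image_path to the validation image folder, and val_annotation_path to the validation annotation file.

  3. Predict with the trained weights
    Prediction uses two files, yolact.py and predict.py. First edit model_path and classes_path in yolact.py; both must be changed.
    model_path points to the trained weights file in the logs folder.
    classes_path points to the txt file listing the detection classes.

    After that, run predict.py and enter an image path when prompted to run detection.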
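
Before pointing train_annotation_path or val_annotation_path at a file, it can be worth confirming that the file really is a COCO-format annotation. The sketch below counts the three top-level lists that a COCO instances file contains. `summarize_coco` is a hypothetical helper, not part of the repository, and it loads the whole json into memory, which is slow for the full training annotations.

```python
import json


def summarize_coco(annotation_path):
    """Count images, annotations and categories in a COCO-format json file."""
    with open(annotation_path, encoding="utf-8") as handle:
        data = json.load(handle)
    return {
        "images": len(data.get("images", [])),
        "annotations": len(data.get("annotations", [])),
        "categories": len(data.get("categories", [])),
    }
```

A count of zero for any of the three keys usually means the wrong file was selected.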

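How2predict

a. Using the pretrained weights

  1. Download and extract the repository, download the weights from Baidu Netdisk, put them in model_data, run predict.py, and enter:
img/street.jpg
  2. Settings in predict.py enable the FPS test and video detection.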
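
The README does not show how the FPS test in predict.py is implemented. As a rough idea of what such a test measures, the sketch below times repeated calls to a prediction function. `measure_fps` and its parameters are hypothetical, and `predict_fn` stands in for whatever callable runs the model on one image.

```python
import time


def measure_fps(predict_fn, image, repeats=10):
    """Call predict_fn(image) `repeats` times and return calls per second."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    predict_fn(image)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        predict_fn(image)
    elapsed = time.perf_counter() - start
    return repeats / elapsed if elapsed > 0 else float("inf")
```

The warm-up call matters on a GPU, where the first inference typically includes one-off initialization cost.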

b. Using weights you trained yourself

  1. Train following the training steps.
  2. In yolact.py, edit model_path and classes_path in the section below so that they match your trained files. model_path is the weights file under the logs folder, and classes_path lists the classes that the weights in model_path were trained on.
_defaults = {
    #--------------------------------------------------------------------------#
    #   To predict with your own trained model you must change model_path
    #   and classes_path!
    #   model_path points to a weights file under logs; classes_path points
    #   to a txt under model_data.
    #
    #   After training, logs contains several weights files; pick one with
    #   a low validation loss.
    #   A low validation loss does not imply a high mAP; it only means the
    #   weights generalize well on the validation set.
    #   If a shape mismatch occurs, also check the model_path and
    #   classes_path that were used during training.
    #--------------------------------------------------------------------------#
    "model_path"        : 'model_data/yolact_weights_shape.h5',
    "classes_path"      : 'model_data/shape_classes.txt',
    #---------------------------------------------------------------------#
    #   Input image size
    #---------------------------------------------------------------------#
    "input_shape"       : [544, 544],
    #---------------------------------------------------------------------#
    #   Only predicted boxes with a score above confidence are kept
    #---------------------------------------------------------------------#
    "confidence"        : 0.5,
    #---------------------------------------------------------------------#
    #   IoU threshold used by non-maximum suppression
    #---------------------------------------------------------------------#
    "nms_iou"           : 0.3,
    #---------------------------------------------------------------------#
    #   Anchor (prior box) sizes
    #---------------------------------------------------------------------#
    "anchors_size"      : [24, 48, 96, 192, 384],
    #---------------------------------------------------------------------#
    #   Use traditional non-maximum suppression
    #---------------------------------------------------------------------#
    "traditional_nms"   : True
}
  3. Run predict.py and enter:
img/street.jpg
  4. Settings in predict.py enable the FPS test and video detection.
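
To make the roles of confidence and nms_iou in _defaults concrete, here is a minimal pure-Python sketch of score filtering followed by greedy non-maximum suppression. It is not the repository's implementation: the function names are hypothetical, boxes are assumed to be (x1, y1, x2, y2) tuples, and all boxes are treated as one class.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(boxes, scores, confidence=0.5, nms_iou=0.3):
    """Return indices of boxes kept after score filtering and greedy NMS."""
    # Keep only boxes whose score is strictly above the confidence threshold.
    candidates = [i for i, s in enumerate(scores) if s > confidence]
    # Visit the remaining boxes from highest to lowest score.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # Drop a box if it overlaps an already kept box by more than nms_iou.
        if all(box_iou(boxes[i], boxes[j]) <= nms_iou for j in kept):
            kept.append(i)
    return kept
```

Raising confidence removes more low-score boxes, while lowering nms_iou suppresses overlapping boxes more aggressively.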

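How2eval

a. Evaluating your own dataset

  1. Evaluation uses the COCO format.
  2. If you ran coco_annotation.py before training, the code has already split the dataset into training, validation and test sets.
  3. To change the test set proportion, edit trainval_percent in coco_annotation.py. trainval_percent sets the ratio of (training + validation) to test; by default (training + validation) : test = 9:1. train_percent sets the ratio of training to validation within (training + validation); by default training : validation = 9:1.
  4. In yolact.py, edit model_path and classes_path. model_path points to the trained weights file in the logs folder. classes_path points to the txt file listing the detection classes.
  5. In eval.py, edit classes_path, which points to the txt file listing the detection classes. This is the same txt used for training and must be changed when evaluating your own dataset. Then run eval.py to get the evaluation results.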
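
With the default ratios from step 3, the two 9:1 splits compound: about 81% of the samples end up in training, 9% in validation and 10% in test. The sketch below reproduces that arithmetic. It is an assumption about how such a split can be done, not the code in coco_annotation.py; `split_dataset` and its `seed` parameter are hypothetical, and the ids are assumed to be unique.

```python
import random


def split_dataset(ids, trainval_percent=0.9, train_percent=0.9, seed=0):
    """Split ids into (train, val, test) lists using the two ratios."""
    ids = list(ids)
    rng = random.Random(seed)
    num_trainval = int(len(ids) * trainval_percent)
    num_train = int(num_trainval * train_percent)
    # First carve out (train + val), then pick train from inside it.
    trainval = rng.sample(ids, num_trainval)
    train = set(rng.sample(trainval, num_train))
    trainval_set = set(trainval)
    train_ids = [i for i in ids if i in train]
    val_ids = [i for i in ids if i in trainval_set and i not in train]
    test_ids = [i for i in ids if i not in trainval_set]
    return train_ids, val_ids, test_ids
```

For 100 samples the defaults give 81 training, 9 validation and 10 test samples.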

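b. Evaluating on the COCO dataset

  1. Download the COCO dataset.
  2. In yolact.py, edit model_path and classes_path. model_path points to the COCO weights in the logs folder. classes_path points to model_data/coco_classes.txt.
  3. In eval.py, set classes_path to model_data/coco_classes.txt. Set Image_dir to the evaluation image folder and Json_path to the evaluation annotation file. Then run eval.py to get the evaluation results.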

Reference

https://github.com/feiyuhuahuo/Yolact_minimal
