
Implementing an Object Detection Case with SSD

2022-04-23 17:54:00 【Stephen_ Tao】

Contents

Introduction to SSD
- SSD structure
- SSD algorithm flow
Case implementation

Introduction to SSD

SSD has the following characteristics:

SSD combines YOLO's regression approach with the anchor mechanism of Faster R-CNN. It regresses over multi-scale regions at every position of the whole image, which keeps YOLO's speed while making the window predictions about as accurate as those of Faster R-CNN.
The core of SSD is to use convolution kernels to predict the class scores and coordinate offsets of a set of default bounding boxes.
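
To make the "coordinate offset" idea concrete, here is a minimal sketch of how a predicted offset could be turned back into a box. It assumes the center-size encoding with variances (0.1, 0.1, 0.2, 0.2) used by the original SSD reference implementation. The function name `decode_box` and the example numbers are illustrative and are not taken from the code in this article.

```python
import numpy as np

def decode_box(default_box, offsets, variances=(0.1, 0.1, 0.2, 0.2)):
    """Decode one predicted box from a default box and regression offsets.

    default_box: (cx, cy, w, h) in normalized [0, 1] coordinates.
    offsets:     (dx, dy, dw, dh) as predicted by the network.
    Returns (xmin, ymin, xmax, ymax), clipped to [0, 1].
    """
    cx, cy, w, h = default_box
    dx, dy, dw, dh = offsets
    # Shift the center in proportion to the default box size
    pred_cx = cx + dx * variances[0] * w
    pred_cy = cy + dy * variances[1] * h
    # Scale width and height exponentially
    pred_w = w * np.exp(dw * variances[2])
    pred_h = h * np.exp(dh * variances[3])
    box = np.array([pred_cx - pred_w / 2, pred_cy - pred_h / 2,
                    pred_cx + pred_w / 2, pred_cy + pred_h / 2])
    return np.clip(box, 0.0, 1.0)

# Zero offsets reproduce the default box itself, in corner form
print(decode_box((0.5, 0.5, 0.2, 0.4), (0.0, 0.0, 0.0, 0.0)))
```

With all offsets set to zero the decoded box is simply the default box, so the network only has to learn small corrections around each default box.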

SSD structure

[Figure: SSD network structure]
SSD is built on VGG-16: it uses VGG's first five convolutional blocks, then adds 5 more convolutional structures starting from Conv6. The input picture must be 300*300.
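
As a rough illustration of what the multi-scale structure implies, the sketch below counts the default boxes for a 300*300 input. The feature map sizes and boxes per location are those of the SSD300 configuration in the original SSD paper, which is an assumption here: the Keras port used later in this article may use a different number of boxes per location, so its output shape may not match this total.

```python
# (layer name, feature map side length, default boxes per location)
# Values follow the SSD300 configuration of the original SSD paper (assumed).
feature_maps = [
    ("conv4_3", 38, 4),
    ("fc7", 19, 6),
    ("conv8_2", 10, 6),
    ("conv9_2", 5, 6),
    ("conv10_2", 3, 4),
    ("conv11_2", 1, 4),
]

def count_default_boxes(layers):
    """Sum side * side * boxes_per_location over all prediction layers."""
    return sum(size * size * boxes for _, size, boxes in layers)

for name, size, boxes in feature_maps:
    print(name, size * size * boxes)

total = count_default_boxes(feature_maps)
print("total default boxes:", total)  # 8732
```

Most of the boxes come from the largest feature map, which is the one responsible for small objects.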

SSD algorithm flow

[Figure: SSD algorithm flow]

Case implementation

We build the model, load the trained parameters, and use it to detect objects of specific classes.

Import packages

from nets.ssd_net import SSD300
from keras.preprocessing.image import load_img,img_to_array
from imageio import imread
from keras.applications.imagenet_utils import preprocess_input
from utils.ssd_utils import BBoxUtility
import matplotlib.pyplot as plt
import numpy as np
import os

Define the initialization function

The functionality is wrapped in a class and called through its methods.

class SSDTest(object):
    def __init__(self):
        # Names of the object classes to detect (the 20 PASCAL VOC classes)
        self.classes_name = ['Aeroplane', 'Bicycle', 'Bird', 'Boat', 'Bottle',
                             'Bus', 'Car', 'Cat', 'Chair', 'Cow', 'Diningtable',
                             'Dog', 'Horse', 'Motorbike', 'Person', 'Pottedplant',
                             'Sheep', 'Sofa', 'Train', 'Tvmonitor']

        # Total number of classes: the object classes plus one background class
        self.classes_nums = len(self.classes_name) + 1
        # Input picture shape: height, width, channels
        self.input_shape = (300, 300, 3)

The initialization function defines the class names for object detection, the total number of classes (including the background class), and the shape of the input picture.

Building and running the model

    def model_process(self):
        # Build the basic SSD300 model
        model = SSD300(self.input_shape, num_classes=self.classes_nums)

        # Load the trained parameters
        model.load_weights('./ckpt/weights_SSD300.hdf5', by_name=True)
        feature = []
        images_data = []

        # Iterate over all the pictures in the images directory
        for pic_name in os.listdir('./images/'):
            img_path = os.path.join('./images/', pic_name)
            # Resized copy used as network input
            img = load_img(img_path, target_size=(self.input_shape[0], self.input_shape[1]))
            img = img_to_array(img)
            feature.append(img)
            # Original-size copy used later for drawing the results
            images_data.append(imread(img_path))
        inputs = preprocess_input(np.asarray(feature))
        preds = model.predict(inputs)
        print(preds.shape)

        # Non-maximum suppression
        bbox_util = BBoxUtility(self.classes_nums)
        results = bbox_util.detection_out(preds)
        print(results[0].shape, results[1].shape)
        return results, images_data
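
The `detection_out` call above is commented as non-maximum suppression, but its internals are not shown in this article. The following is a minimal NumPy sketch of greedy NMS for a single class, to show the idea. The names `iou` and `nms` and the threshold of 0.45 are assumptions made for illustration; they are not read from `BBoxUtility`.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (xmin, ymin, xmax, ymax)."""
    ix_min = np.maximum(box[0], boxes[:, 0])
    iy_min = np.maximum(box[1], boxes[:, 1])
    ix_max = np.minimum(box[2], boxes[:, 2])
    iy_max = np.minimum(box[3], boxes[:, 3])
    # Negative widths or heights mean no overlap, so clip them to zero
    inter = np.clip(ix_max - ix_min, 0, None) * np.clip(iy_max - iy_min, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: keep the best box, drop boxes that overlap it too much, repeat."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

# Made-up boxes: the first two overlap heavily, the third is separate
boxes = np.array([[0.10, 0.10, 0.50, 0.50],
                  [0.12, 0.12, 0.52, 0.52],
                  [0.60, 0.60, 0.90, 0.90]])
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]
```

The second box is suppressed because it overlaps the higher-scoring first box, while the third box survives because it does not overlap anything that was kept.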

Processing the results and annotating the pictures

    def tag_picture(self,images,results):
        for i,img in enumerate(images):
            pre_label = results[i][:, 0]
            pre_conf = results[i][:, 1]
            pre_xmin = results[i][:, 2]
            pre_ymin = results[i][:, 3]
            pre_xmax = results[i][:, 4]
            pre_ymax = results[i][:, 5]
            # print("label:{}, probability:{}, xmin:{}, ymin:{}, xmax:{}, ymax:{}".
            # format(pre_label, pre_conf, pre_xmin, pre_ymin, pre_xmax, pre_ymax))

            # Filter out detections with low confidence
            top_indices = [k for k, conf in enumerate(pre_conf) if conf >= 0.6]
            top_conf = pre_conf[top_indices]
            top_label_indices = pre_label[top_indices].tolist()
            top_xmin = pre_xmin[top_indices]
            top_ymin = pre_ymin[top_indices]
            top_xmax = pre_xmax[top_indices]
            top_ymax = pre_ymax[top_indices]
            print("label:{}, probability:{}, xmin:{}, ymin:{}, xmax:{}, ymax:{}".
                  format(top_label_indices, top_conf, top_xmin, top_ymin, top_xmax, top_ymax))

            # print(np.array(images)/255.0)
            # Plot the picture, with one color per class index (including background)
            colors = plt.cm.hsv(np.linspace(0, 1, self.classes_nums)).tolist()
            plt.figure()
            plt.imshow(img / 255.)
            currentAxis = plt.gca()

            # Use j so that the outer picture index i is not shadowed
            for j in range(top_conf.shape[0]):
                # Scale the normalized coordinates to the original picture size
                xmin = int(round(top_xmin[j] * img.shape[1]))
                ymin = int(round(top_ymin[j] * img.shape[0]))
                xmax = int(round(top_xmax[j] * img.shape[1]))
                ymax = int(round(top_ymax[j] * img.shape[0]))

                # Get the predicted score and class name, and pick the display color
                score = top_conf[j]
                label = int(top_label_indices[j])
                # Label 0 is the background, so object labels start at 1
                label_name = self.classes_name[label - 1]
                display_txt = '{:0.2f}, {}'.format(score, label_name)
                coords = (xmin, ymin), xmax - xmin + 1, ymax - ymin + 1
                color = colors[label]
                # Draw the bounding box
                currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor=color, linewidth=2))
                # Show the score and class name at the upper left corner
                currentAxis.text(xmin, ymin, display_txt,
                                 bbox={'facecolor': color, 'alpha': 0.5})

            plt.show()
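
The filtering and coordinate scaling in `tag_picture` can be tried in isolation, without a model or any pictures. The sketch below uses a made-up detections array in the same column layout (label, confidence, xmin, ymin, xmax, ymax). `to_pixel_boxes` is a hypothetical helper written for this illustration and is not part of `utils.ssd_utils`.

```python
import numpy as np

def to_pixel_boxes(detections, img_height, img_width, conf_threshold=0.6):
    """Keep confident detections and scale their boxes to pixel coordinates.

    detections: array-like of shape (N, 6) with columns
                [label, confidence, xmin, ymin, xmax, ymax], boxes in [0, 1].
    Returns a list of (label, confidence, xmin, ymin, xmax, ymax) tuples in pixels.
    """
    detections = np.asarray(detections, dtype=float).reshape(-1, 6)
    # A boolean mask replaces the explicit index list used in tag_picture
    kept = detections[detections[:, 1] >= conf_threshold]
    # x coordinates scale with the width, y coordinates with the height
    scale = np.array([img_width, img_height, img_width, img_height])
    pixels = np.rint(kept[:, 2:6] * scale).astype(int)
    return [(int(label), float(conf), *map(int, box))
            for label, conf, box in zip(kept[:, 0], kept[:, 1], pixels)]

# Made-up detections, not real model output
fake = [[15, 0.92, 0.10, 0.20, 0.50, 0.80],   # label 15 maps to 'Person' via label - 1
        [7, 0.30, 0.00, 0.00, 0.40, 0.40]]    # below the 0.6 threshold, dropped
print(to_pixel_boxes(fake, img_height=400, img_width=600))
```

Reshaping to `(-1, 6)` also lets the helper return an empty list when a picture has no detections at all.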

Write the main function

if __name__ == '__main__':
    ssd = SSDTest()
    results,images = ssd.model_process()
    ssd.tag_picture(images,results)

Running results

I chose 4 pictures. The results of running the SSD object detection algorithm on them are shown below:
[Figure: SSD detection results on the 4 test pictures]

As you can see, SSD detects the objects in these pictures fairly well.