
Mobile/Embedded-CV Model-2017: MobileNets-v1

2022-08-08 09:32:00 【u013250861】

MobileNets-v1 original paper: "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications"

MobileNet is a lightweight deep neural network proposed by Google for mobile phones and other embedded devices. Its core idea is the depthwise separable convolution.

I. What is depthwise separable convolution (as used in MobileNet-v1)

Assume a convolutional layer with a 3×3 kernel, 16 input channels, and 32 output channels.

A conventional convolution applies 32 kernels of size 3×3×16 to the 16-channel input. By the parameter formula for a convolutional layer, counting one bias per kernel, the number of parameters required is 32 × (3×3×16 + 1) = 4640.
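The count above can be checked with a few lines of Python. This is a minimal sketch: the helper name `conv_params` is hypothetical, and it assumes one bias term per output kernel, as the formula in the text does.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=True):
    """Parameter count of an ordinary convolutional layer (hypothetical helper)."""
    # Each kernel spans the full input depth: k x k x in_channels weights.
    weights_per_kernel = kernel_size * kernel_size * in_channels
    # One optional bias per kernel, matching the "+1" in the text's formula.
    per_kernel = weights_per_kernel + (1 if bias else 0)
    # One kernel per output channel.
    return out_channels * per_kernel


standard = conv_params(3, 16, 32)
print(standard)  # 4640
```

Passing `bias=False` drops the 32 bias terms and gives 4608, which is the figure some write-ups quote for the same layer.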

Now suppose 16 kernels of size 3×3×1 are first applied to the 16-channel input, one kernel per channel, producing 16 feature maps. These maps are not yet combined across channels; 32 kernels of size 1×1×16 then traverse the 16 feature maps to perform the fusion. By the same parameter formula, the number of parameters required is (3×3×1×16 + 16) + (1×1×16×32 + 32) = 160 + 544 = 704.
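The two terms of this formula can be evaluated separately. This is a minimal sketch under the same assumption of one bias per kernel; the helper name `depthwise_separable_params` is hypothetical. The depthwise term comes to 160 and the pointwise term to 544.

```python
def depthwise_separable_params(kernel_size, in_channels, out_channels, bias=True):
    """Return (depthwise, pointwise) parameter counts (hypothetical helper)."""
    # Depthwise step: one k x k x 1 kernel per input channel.
    depthwise = kernel_size * kernel_size * 1 * in_channels
    # Pointwise step: out_channels kernels of size 1 x 1 x in_channels.
    pointwise = 1 * 1 * in_channels * out_channels
    if bias:
        depthwise += in_channels   # one bias per depthwise kernel
        pointwise += out_channels  # one bias per pointwise kernel
    return depthwise, pointwise


dw_count, pw_count = depthwise_separable_params(3, 16, 32)
separable_total = dw_count + pw_count
standard_total = 32 * (3 * 3 * 16 + 1)  # ordinary 3x3 convolution, same layer

print(dw_count, pw_count, separable_total)          # 160 544 704
print(round(separable_total / standard_total, 3))   # 0.152
```

For this layer the separable version needs roughly 15% of the parameters of the ordinary convolution.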

That is what a depthwise separable convolution does. Put simply, an ordinary convolutional layer performs feature extraction and feature combination in a single step, whereas a depthwise separable convolution first applies 3×3 kernels of depth 1 (the depthwise convolution) and then adjusts the number of channels with 1×1 kernels (the pointwise convolution), so feature extraction and feature combination are performed separately.

As the example shows, the depthwise separable convolution cuts the layer's parameter count to roughly 15% of the ordinary convolution's. Its structure is shown below (left: ordinary convolutional layer; right: depthwise separable convolution):

[Figure: ordinary convolutional layer (left) and depthwise separable convolution (right)]

In the depthwise (DW) convolution, only one convolution kernel with in_channels slices (one slice per input channel) is used, and it performs feature extraction only (no feature combination);
In the pointwise (PW) convolution, only output_channels kernels of size 1×1×in_channels are used, and they perform feature combination only.
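The two steps can be sketched in plain Python to show how the tensor shapes change. This is an illustrative sketch, not the MobileNet implementation: it assumes stride 1, no padding, no bias, and no activation, and all function and variable names are hypothetical.

```python
def depthwise_conv(x, dw_kernels):
    """x: [C][H][W], dw_kernels: [C][k][k]. One k x k filter per input channel."""
    k = len(dw_kernels[0])
    out_h = len(x[0]) - k + 1
    out_w = len(x[0][0]) - k + 1
    out = []
    # Each channel is filtered on its own; channels are never mixed here.
    for channel, kernel in zip(x, dw_kernels):
        fmap = [
            [
                sum(
                    channel[i + di][j + dj] * kernel[di][dj]
                    for di in range(k)
                    for dj in range(k)
                )
                for j in range(out_w)
            ]
            for i in range(out_h)
        ]
        out.append(fmap)
    return out  # same number of channels as the input


def pointwise_conv(x, pw_kernels):
    """x: [C_in][H][W], pw_kernels: [C_out][C_in]. A 1x1 conv that mixes channels."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(weights[c] * x[c][i][j] for c in range(len(x))) for j in range(w)]
            for i in range(h)
        ]
        for weights in pw_kernels
    ]


def depthwise_separable_conv(x, dw_kernels, pw_kernels):
    # Feature extraction (depthwise) followed by feature combination (pointwise).
    return pointwise_conv(depthwise_conv(x, dw_kernels), pw_kernels)


# Toy input: 2 channels of size 4x4 (channel 0 is all 1.0, channel 1 is all 2.0).
x = [[[float(c + 1)] * 4 for _ in range(4)] for c in range(2)]
dw_kernels = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]  # two 3x3 filters
pw_kernels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]               # 3 output channels

y = depthwise_separable_conv(x, dw_kernels, pw_kernels)
print(len(y), len(y[0]), len(y[0][0]))  # 3 2 2
```

The depthwise step keeps the channel count at 2 and only shrinks the spatial size; the pointwise step leaves the spatial size alone and changes the channel count to 3. In deep learning frameworks the depthwise step is usually expressed as a grouped convolution with the number of groups equal to in_channels.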


