[Point cloud series] PnP-3D: A Plug-and-Play for 3D Point Clouds
2022-04-23 07:20:00 【^_^ Min Fei】
I finally have time to read papers again. Having time to read papers is a real pleasure, and it lets me get back into the field.
1. Summary
Venue: TPAMI (a top journal), 2021; only 8 pages long.
Code: https://github.com/ShiQiu0419/pnp-3d
2. Motivation
The goal is a plug-and-play feature-enhancement module that improves the effectiveness of existing point cloud analysis networks. Its key properties:
- It enhances the feature representation of existing networks while staying lightweight and general.
- Enhancement: local geometry and feature information (with a focus on geometric relevance)
- Lightweight: global bilinear regularization (similar to a pruned version of attention)
- Generality: point cloud classification, point cloud semantic segmentation, and object detection
The figure below gives an intuitive view of the effect.

3. Method
The network architecture of the proposed method is shown below. It has two main parts: local context fusion and global bilinear regularization.

3.1 Local context fusion
Purpose: fuse local geometry and feature context based on geometric relevance in 3D space.
Approach: two branches encode the local geometry and the feature context separately, and their outputs are concatenated at the end.
3.1.1 Local geometric branch:
Purpose: positional encoding provides a certain amount of ordering information.

The process has three steps:
- Use the DGCNN (EdgeConv) approach to find the k nearest neighbors (KNN).
As the formula below shows, the $k$ nearest neighbors of any point $p_i$ are denoted $\mathcal{N}(p_i)$.

- Build the local geometric context
This step is the core of the proposed method. For each point $p_i$, the local geometric context consists of the point itself and the differences to its nearest neighbors, $p_{i_k}-p_i$, so the whole is written as $[p_i;\, p_{i_k}-p_i]$. There are two parts with 3 channels each and $k$ neighbors, which gives a $k \times 6$ input per point.
The geometric context of all points together is then denoted $\tilde{\mathcal{P}}$.

- Encode the geometry
This resembles the standard PointNet operation: $\tilde{\mathcal{P}}$ passes through a 1x1 MLP + BN + ReLU, and max pooling over the neighbors then keeps the largest response as the overall local geometric information.
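The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the neighbor search is brute force and does not count a point as its own neighbor, and a caller-supplied weight matrix stands in for the learned 1x1 MLP (BN is omitted).

```python
import math

def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point (brute force)."""
    result = []
    for i, p in enumerate(points):
        # Assumption: a point is not counted as its own neighbor.
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        result.append(others[:k])
    return result

def local_geometric_context(points, k):
    """Build [p_i ; p_ik - p_i] for every point: k rows of 6 values."""
    context = []
    for p, idxs in zip(points, knn_indices(points, k)):
        rows = []
        for j in idxs:
            diff = [q - c for q, c in zip(points[j], p)]
            rows.append(list(p) + diff)  # 3 + 3 = 6 channels
        context.append(rows)
    return context  # N x k x 6

def encode(context, weight):
    """Shared linear map + ReLU on each row, then max pooling over the k neighbors."""
    pooled = []
    for rows in context:
        hidden = [
            [max(0.0, sum(x * w for x, w in zip(row, col))) for col in zip(*weight)]
            for row in rows
        ]
        pooled.append([max(col) for col in zip(*hidden)])
    return pooled  # N x C_out

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
context = local_geometric_context(points, k=2)
# Hypothetical 6 x 2 weights: channel 0 sums the three offsets, channel 1 copies x.
weight = [[0, 1], [0, 0], [0, 0], [1, 0], [1, 0], [1, 0]]
encoded = encode(context, weight)
```

Each entry of `context` holds $k$ rows of 6 values, matching the $k \times 6$ input described above, and `encoded` holds one pooled vector per point.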

3.1.2 Feature context branch:

Purpose: encode the extracted features.
The steps are the same as in 3.1.1. The only difference is that this branch operates on features, whereas the one above operates on point coordinates, so only the formula is listed here:
The MLP is shared with 3.1.1 above.

3.1.3 Aggregation
The geometric information (3.1.1) and the feature context information (3.1.2) are fused together.
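A minimal sketch of the feature branch and the fusion step follows. It rests on two assumptions: the feature branch reuses the neighbor indices found in 3D space (suggested by the stated purpose of 3.1), and fusion is a plain channel-wise concatenation. The learned MLP is left out and all names are hypothetical.

```python
def local_feature_context(features, neighbor_idx):
    """Build [f_i ; f_ik - f_i], reusing neighbor indices found in 3D space."""
    context = []
    for f, idxs in zip(features, neighbor_idx):
        rows = [
            list(f) + [fn - fc for fn, fc in zip(features[j], f)]
            for j in idxs
        ]
        context.append(rows)
    return context  # N x k x 2C

def max_pool_neighbors(context):
    """Channel-by-channel maximum over the k neighbors of each point."""
    return [[max(col) for col in zip(*rows)] for rows in context]

def fuse(geometric, feature):
    """Concatenate the two per-point encodings along the channel axis."""
    return [g + f for g, f in zip(geometric, feature)]

features = [[1.0, 2.0], [3.0, 1.0], [0.0, 0.0]]   # N = 3 points, C = 2 channels
neighbor_idx = [[1, 2], [0, 2], [0, 1]]           # assumed to come from the 3D KNN
feature_code = max_pool_neighbors(local_feature_context(features, neighbor_idx))
geometric_code = [[0.5], [0.1], [0.2]]            # placeholder output of branch 3.1.1
fused = fuse(geometric_code, feature_code)
```

The fused vector of each point simply carries the geometric channels followed by the feature channels.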


3.1.4 Code implementation
Definition :

Implementation:

3.2 Global bilinear regularization
Purpose: strengthen the expressiveness of the features.
Approach: first compute point-wise and channel-wise feature information, then aggregate them through regularization to form the final representation.

3.2.1 Channel-wise information

In essence, a weight is learned for each feature channel and the averaged result is used as the descriptor. The formula is as follows:
The effect is basically the same as attention. The only difference is that not all channels are used, only $\frac{C}{r}$ of them. Here $r > 2$ is a reduction factor whose main job is to shrink the output dimension; it is set to 8 in the experiments. Because this cuts the amount of computation, the module is much lighter than attention. The concrete steps are: 1. matrix multiplication; 2. ReLU activation; 3. average pooling, which gives $g_c$.
Formula analysis: $\mathbf{W}_c$ is a weight matrix. ReLU serves two purposes: it adds nonlinearity, and it guarantees a non-negative output, which formula (6) requires because it takes a square root. In practice this is a 1D convolution whose input is $\mathcal{F}_L$ and whose output is $\mathcal{F}_L\mathbf{W}_c$; only the dimension changes.

3.2.2 Point-wise information

The operation mirrors the channel-wise one, so only the formula is listed:
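The channel-wise and point-wise descriptors follow the same three steps (matrix multiplication, ReLU, average pooling) and differ in the axis that is averaged. The sketch below illustrates this with toy sizes ($N = 3$, $C = 6$, $r = 3$). The function names and the weight values are hypothetical, and the same toy matrix is reused for both descriptors only to keep the example short; the post describes separate learned weights.

```python
def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Element-wise ReLU, which keeps every entry non-negative."""
    return [[max(0.0, v) for v in row] for row in m]

def channel_descriptor(f_l, w_c):
    """g_c: average over the N points, one value per reduced channel."""
    hidden = relu(matmul(f_l, w_c))           # N x (C / r)
    return [sum(col) / len(hidden) for col in zip(*hidden)]

def point_descriptor(f_l, w_p):
    """g_p: average over the reduced channels, one value per point."""
    hidden = relu(matmul(f_l, w_p))           # N x (C / r)
    return [sum(row) / len(row) for row in hidden]

f_l = [
    [1.0, -1.0, 2.0, 0.0, 1.0, 0.0],
    [0.0, 3.0, -2.0, 1.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
]
# Hypothetical 6 x 2 weight matrix (C = 6 reduced to C / r = 2).
weights = [[1, 0], [0, 1], [1, 0], [0, -1], [1, 0], [0, 1]]
g_c = channel_descriptor(f_l, weights)   # length C / r
g_p = point_descriptor(f_l, weights)     # length N
```

Both descriptors are non-negative by construction, which is what the later square root needs.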

3.2.3 Aggregation

The aggregation first takes the outer product and then the square root, as in formula (6).
Formula (8) then gives the expression for $\mathcal{F}_G$.
The final output is formula (9). Their meaning is as follows:

Formula (8): this follows reference [42], with two connections added.
Formula (9): the output should be more distinctive and representative. Because formula (8) is obtained by averaging, it represents the common patterns, which can be filtered out. The final output is therefore a discriminative feature map with the common patterns removed.
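The outer product with a square root and the final filtering step can be sketched as follows. Two things are assumptions here: that formula (9) subtracts $\mathcal{F}_G$ from $\mathcal{F}_L$ (suggested by the "subtraction strategy" in the ablation in 4.5), and the placeholder values used for $\mathcal{F}_G$, since the mapping in formula (8) is not reproduced. Function names are hypothetical.

```python
import math

def bilinear_response(g_p, g_c):
    """Outer product of the two descriptors, then an element-wise square root."""
    # sqrt(p * c) is the geometric mean of p and c; both must be non-negative.
    return [[math.sqrt(p * c) for c in g_c] for p in g_p]   # N x (C / r)

def filter_common_patterns(f_l, f_g):
    """Assumed form of the output: F_L minus the common-pattern map F_G."""
    return [[a - b for a, b in zip(row_l, row_g)] for row_l, row_g in zip(f_l, f_g)]

g_p = [4.0, 1.0]          # one value per point
g_c = [1.0, 9.0, 0.0]     # one value per reduced channel
response = bilinear_response(g_p, g_c)

f_l = [[1.0, 2.0], [3.0, 4.0]]
f_g = [[0.5, 0.5], [1.0, 1.0]]   # placeholder for the output of formula (8)
output = filter_common_patterns(f_l, f_g)
```

Each entry of `response` is the geometric mean of one point value and one channel value, which is why ReLU has to keep both descriptors non-negative.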

3.2.4 Code implementation
The code uses the two corresponding convolutions, which change the dimensions; the 8 here corresponds to $r$.
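To make the effect of the reduction factor concrete, the short calculation below counts the weights of a bias-free 1x1 projection with and without reduction. The channel count of 64 is a hypothetical example, not a value taken from the paper or the repository, and the helper name is made up for illustration.

```python
def projection_weights(channels, reduction=1):
    """Number of weights in a bias-free 1x1 convolution from C to C / r channels."""
    if channels % reduction != 0:
        raise ValueError("channels must be divisible by the reduction factor")
    return channels * (channels // reduction)

full = projection_weights(64)              # 64 -> 64 channels
reduced = projection_weights(64, 8)        # 64 -> 8 channels, r = 8
savings = full // reduced                  # factor by which the weights shrink
```

With $r = 8$ the projection needs one eighth of the weights of a full-width one, in line with the claim that the module is lighter than attention.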
Definition :

Implementation:

4. Results
4.1 Classification
Input: 1024 points
For RS-CNN + PnP-3D and PointNet++ + PnP-3D, PnP-3D is added after every SA layer.
For the other networks, PnP-3D is added after each convolution block.

4.2 Semantic segmentation

4.3 Object detection

4.4 Comparison with existing baselines for each task

4.5 Ablation experiment: formulas
Figure 4: ablation on formulas 4, 5, and 9. Using average pooling throughout together with the subtraction strategy works best.
Figure 5: ablation on the final aggregation formula. The geometric mean works best.


4.6 Ablation experiment: attention

4.7 Visualization
To be honest, it looks like an enhancement along the edges, but the effect is not that obvious.

The main point here is where the proposed module is used: it sits at the core of the network.

5. Reflections
The overall approach is really simple, yet effective. The experiments are thorough, and the code is simple.
Copyright notice
This article was written by [^_^ Min Fei]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230611136273.html