[Point Cloud Series] A Rotation-Invariant Framework for Deep Point Cloud Analysis
2022-04-23 07:20:00 【^_^ Min Fei】
1. Summary
Published in TVCG, 2021.
Code: https://github.com/nini-lxz/Rotation-Invariant-Point-Cloud-Analysis
2. Motivation
The common problem with existing methods is that rotation invariance is not guaranteed.
This work guarantees it by construction.
The idea is to use a low-level, rotation-invariant representation to replace the raw 3D Cartesian coordinates as the network input, somewhat like feeding hand-designed rotation-invariant features into the network to ease its learning.
3. Method
3.1 Feature extraction $A$ in common methods
Built from global features $G_i$ + local features $L_{ij}$ + a nonlinear function $h_{\theta}$,
where $A$ is a symmetric (order-independent) aggregation function.
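As a point of reference, the common formulation described above can be written roughly as follows (my notation, not copied from the paper):

$$
F_i = A\big(\{\, h_{\theta}(G_i, L_{ij}) \,\}_{j=1}^{K}\big), \qquad A \in \{\max,\ \mathrm{sum},\ \mathrm{avg}\}
$$

That is, a shared nonlinear function $h_{\theta}$ combines the global feature $G_i$ with each local feature $L_{ij}$, and the symmetric function $A$ aggregates over the $K$ neighbors independently of their order.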

3.2 Rotation invariance
A framework is rotation invariant = the network input is rotation invariant (in practice, a rotation-invariant representation is extracted from the input point cloud and used in place of the raw point cloud as the input) + the operations inside the network are rotation invariant.
Network input design
Four points to consider:
- No matter how the input point cloud $S$ is rotated, the extracted rotation-invariant representation must stay the same. Let $\Phi$ be the function that extracts the rotation-invariant representation; then it must satisfy $\Phi(R \cdot S) = \Phi(S)$ (Eq. 2), where $R$ is an arbitrary rotation in 3D coordinates.
- It is easy to satisfy Eq. (2) by using only L2 distances or relative angles as input, but that is too coarse and loses information;
- No ambiguity, i.e., different local regions should have their own distinct rotation-invariant representations;
- The representation needs to be robust to noise.
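To make Eq. (2) concrete, here is a minimal NumPy check (illustrative only, not the paper's code; `random_rotation` is a helper I made up) that distance-based quantities are unchanged under an arbitrary rotation, which is exactly why such quantities are natural but coarse candidates for $\Phi$:

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3D rotation matrix (QR-based, sign-fixed so det = +1)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0
    return q

rng = np.random.default_rng(0)
S = rng.normal(size=(1024, 3))    # a toy point cloud S
R = random_rotation(rng)
S_rot = S @ R.T                   # R * S: rotate every point

# Distances to the origin and pairwise distances satisfy Phi(R*S) = Phi(S),
# but, as noted above, on their own they are too coarse and lose information.
assert np.allclose(np.linalg.norm(S, axis=1), np.linalg.norm(S_rot, axis=1))
assert np.allclose(np.linalg.norm(S[0] - S[1]), np.linalg.norm(S_rot[0] - S_rot[1]))
```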
Network architecture design
Two points to consider:
- The network must not contain any rotation-variant operations; for example, it cannot rely on a fixed point ordering.
- The network does not take point-cloud coordinates directly; only relative geometric information such as distances and angles is used as input.
3.3 Rotation-invariant representation
- Preprocessing: normalize the input point cloud $S$ into the unit sphere, then use the query-ball grouping from PointNet++ to define the neighboring points $\{p_{ij}\}_{j=1}^{K}$ of each point $p_i$, as shown in Fig. 2(a) and (b).
- Computation: extract the global feature $G_i$ and the local features $\{L_{ij}\}_{j=1}^{K}$ of $p_i$; the overall feature construction is shown in Fig. 3.

Global feature $G_i$: contains 5 parts, as shown in Fig. 2(a)&(b)

1) $d_{p_i}=\|p_i\|_2$: the distance from $p_i$ to the origin, a simple global rotation-invariant quantity.
2) $d_{pm_i}$: the distance between $p_i$ and a local reference point $m_i$, chosen as the geometric median of the neighborhood.
3)-5): the last three parts involve $d_{sm_i}$ and related quantities, where $s_i$ is the intersection of the unit sphere with the extension of the line from the origin through $p_i$, and the quantities come from the triangle $p_i$-$m_i$-$s_i$. In this paper's setup, the query-ball radius increases with the network level.
Local features: 7 parts, as shown in Fig. 2 (a rough sketch of the distance-based parts follows).
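Below is a rough NumPy sketch of three of the distance-based quantities under my reading of Fig. 2 (the helper names `geometric_median` and `global_ri_parts` are mine, and the exact set of five quantities may differ from the paper):

```python
import numpy as np

def geometric_median(points, n_iter=50, eps=1e-8):
    """Weiszfeld iterations for the geometric median of a (K, 3) neighborhood."""
    m = points.mean(axis=0)
    for _ in range(n_iter):
        d = np.linalg.norm(points - m, axis=1) + eps
        m = (points / d[:, None]).sum(axis=0) / (1.0 / d).sum()
    return m

def global_ri_parts(p_i, neighbors):
    """Sketch of distance-based, rotation-invariant quantities around point p_i.

    Assumes the cloud is normalized into the unit sphere; s_i is taken as the
    intersection of the ray origin->p_i with the sphere (my reading of Fig. 2).
    """
    m_i = geometric_median(neighbors)
    s_i = p_i / (np.linalg.norm(p_i) + 1e-8)   # point on the unit sphere
    d_p = np.linalg.norm(p_i)                  # distance to the origin
    d_pm = np.linalg.norm(p_i - m_i)           # distance to the local reference m_i
    d_sm = np.linalg.norm(s_i - m_i)           # one quantity from triangle p_i-m_i-s_i
    return np.array([d_p, d_pm, d_sm])
```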


3.4 Overall network framework
There are 3 layers in total.
Yellow box: extracted rotation-invariant features
Green box: point cloud coordinates
Purple box: farthest point sampling
Blue box: features embedded in the network
Layer 1:
- Following PointNet++, sample and group: use farthest point sampling to select $N_1$ points (see the sketch after this list).
- Use query ball to find the $K_1$ nearest neighbors of each sampled point, building groups of size $N_1 \times K_1 \times 3$, denoted $S^G_1$.
- Do two things in parallel:
(1) extract the rotation-invariant features $I_1$ (yellow box),
(2) compute the global relation matrix $R_1$ (yellow box).
- Both $I_1$ and $R_1$ in Fig. 3 are fed into the region relation convolution (orange box) to obtain the features $F_1$ (blue box).
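As referenced in the first bullet, here is a generic NumPy sketch of the farthest-point-sampling and query-ball grouping step in the PointNet++ style (illustrative parameter values; not the authors' implementation):

```python
import numpy as np

def farthest_point_sampling(points, n_samples):
    """Greedy farthest point sampling; returns indices of n_samples points."""
    n = points.shape[0]
    idx = np.zeros(n_samples, dtype=np.int64)
    dist = np.full(n, np.inf)
    idx[0] = 0
    for i in range(1, n_samples):
        dist = np.minimum(dist, np.linalg.norm(points - points[idx[i - 1]], axis=1))
        idx[i] = int(np.argmax(dist))
    return idx

def ball_query(points, centers, radius, k):
    """For each center, pick up to k neighbors within `radius`
    (repeat indices when fewer than k neighbors are found)."""
    groups = []
    for c in centers:
        d = np.linalg.norm(points - c, axis=1)
        inside = np.where(d <= radius)[0]
        if inside.size == 0:
            inside = np.array([int(np.argmin(d))])
        groups.append(np.resize(inside, k))
    return np.stack(groups)                      # (N1, K1) indices

# Toy usage: build the (N1, K1, 3) grouped tensor S^G_1
pts = np.random.default_rng(0).normal(size=(2048, 3))
centers_idx = farthest_point_sampling(pts, 512)
group_idx = ball_query(pts, pts[centers_idx], radius=0.2, k=32)
S_G1 = pts[group_idx]                            # shape (512, 32, 3)
```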
Layer 2:
5. Continue sampling into smaller regions to get $S_2^G$.
6. The features of the first layer are concatenated with the sampled points to form the second-level feature $F^G_2$; here $N_2<N_1$ and $K_2>K_1$, which gradually enlarges the receptive field.
7. Then concatenate $F_2^G$ with $F_2^I$ to compensate for the information loss, where $I_2$ is the higher-level semantic feature obtained through a multi-layer perceptron;
8. Finally the concatenated feature $F^C_2$ is produced, and together with the relation matrix $R_2$ it yields the second-layer feature $F_2$.
Layer 3:
9. Continue sampling and grouping to obtain $S^G_3$.
10. Similarly to layer 2, obtain the feature $F^C_3$.
11. Apply a multi-layer perceptron and then max pooling to get the final feature $F_3$ (a high-level sketch of the whole three-layer pipeline is given below).
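As promised in step 11, here is a compact PyTorch skeleton of how the three levels could stack. This is my simplification, not the authors' code: a plain pointwise MLP with max pooling stands in for the region relation convolution, and the level-$l{+}1$ centers are assumed to be the first $N_{l+1}$ points of level $l$.

```python
import torch
import torch.nn as nn

class RIThreeLevelSketch(nn.Module):
    """High-level sketch of the 3-level hierarchy described above (my reading,
    not the authors' code). Each level consumes rotation-invariant inputs I_l of
    shape (B, N_l, K_l, ri_dim), mixes them with features carried over from the
    previous level, and aggregates over the K_l neighbors; the last level ends
    with global max pooling to produce the final feature F_3."""

    def __init__(self, ri_dim=12, dims=(64, 128, 256)):
        super().__init__()
        self.mlps, in_dim = nn.ModuleList(), ri_dim
        for d in dims:
            self.mlps.append(nn.Sequential(nn.Linear(in_dim, d), nn.ReLU()))
            in_dim = d + ri_dim   # next level sees previous features + new RI inputs

    def forward(self, ri_feats):  # ri_feats: list of 3 tensors (B, N_l, K_l, ri_dim)
        prev = None
        for mlp, I_l in zip(self.mlps, ri_feats):
            if prev is None:
                x = I_l
            else:
                # Carry over features of the (assumed) N_l sampled centers and
                # broadcast them to every neighbor before concatenation.
                carried = prev[:, : I_l.shape[1], None, :].expand(-1, -1, I_l.shape[2], -1)
                x = torch.cat([I_l, carried], dim=-1)
            x = mlp(x)                      # stand-in for the region relation convolution
            prev = x.max(dim=2).values      # max over the K_l neighbors -> (B, N_l, d)
        return prev.max(dim=1).values       # global max pooling -> final feature F_3
```

For example, with `ri_feats` of shapes (B, 512, 32, 12), (B, 128, 48, 12), (B, 32, 64, 12), the output is a (B, 256) global descriptor.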

The computation of the Region Relation Convolution is shown in Fig. 5; in essence it is similar to an attention operation.
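The exact computation is defined in Fig. 5 of the paper; as a mental model only, here is a loose attention-style sketch of what a relation matrix $R$ could do to the per-neighbor features $I$ (all names and shapes are my assumptions):

```python
import torch

def region_relation_conv_sketch(I, R, mlp):
    """Loose attention-like sketch (NOT the paper's exact Fig. 5 formulation).

    I:   (B, N, K, C)  rotation-invariant per-neighbor features
    R:   (B, N, K, K)  relation/affinity matrix within each local region
    mlp: callable mapping (..., C) -> (..., C_out), e.g. a shared nn.Linear
    """
    weights = torch.softmax(R, dim=-1)                    # normalize relations
    mixed = torch.einsum('bnkj,bnjc->bnkc', weights, I)   # relation-weighted mixing
    return mlp(mixed).max(dim=2).values                   # aggregate over K neighbors
```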

4. Experiments
Classification accuracy: the numbers look good.

Table 2:
z/SO3: the training set is augmented only with rotations about the z axis, while the test set is rotated arbitrarily;
z/z: both training and testing use only z-axis rotation augmentation;
SO3/SO3: both training and testing use arbitrary rotations.
This illustrates the stability of the proposed method(?)
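For readers unfamiliar with the notation, a minimal sketch of how the two kinds of rotations are usually generated (illustrative only; not tied to the paper's evaluation code):

```python
import numpy as np

def rot_z(rng):
    """Random rotation about the z axis only (the 'z' side of a protocol)."""
    a = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_so3(rng):
    """Random rotation anywhere in SO(3) (the 'SO3' side), via QR with det fixed to +1."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0
    return q

# z/SO3 then means: augment training clouds with rot_z(rng) but rotate test clouds
# with rot_so3(rng); z/z and SO3/SO3 use the same generator on both sides.
```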

Table 3:
NR/NR: neither training nor testing uses rotation;
NR/AR: training without rotation, testing with arbitrary rotation.
There is almost no difference between the two. Why? Because the proposed method is designed for rotation invariance, the extracted features already have this property; even if the training set sees no rotations, the network should still be rotation invariant. The result confirms that the method indeed behaves this way.

Visualization results:

Segmentation results:

Ablation experiments:

5. Limitations
- The method is simple, but the results are not that strong;
- Its handling of noise is in fact limited;
- It relies on hand-designed invariant features, and whether these are effective for all objects is doubtful.