
[Point cloud series] FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation

2022-04-23 13:18:00 【^_^ Min Fei】

Table of contents

1. Summary
2. Motivation
3. Idea
4. Experimental results
6. References

Part of the backlog-clearing series

1. Summary

Title: FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation (CVPR'18 spotlight)
Paper: https://openaccess.thecvf.com/content_cvpr_2018/papers/Yang_FoldingNet_Point_Cloud_CVPR_2018_paper.pdf
Supplementary material: https://openaccess.thecvf.com/content_cvpr_2018/Supplemental/1129-supp.pdf
Code: https://www.merl.com/research/license#FoldingNet
Overview: generates 3D objects based on the idea of folding a 2D sheet of paper.

Follow-up extensions:

FoldingNet++
Real-time Soft Robot 3D Proprioception via Deep Vision based Sensing

Similar work:
AtlasNet, which uses multiple grid patches to initialize the 2D grid.

2. Motivation

Can a neural network learn how to fold paper?
Most 3D point cloud data is sampled from the surface of an object. Such a surface can be regarded as a 2D manifold, so it can be obtained from a 2D plane through a series of transformations.

3. Idea

Overall framework: the decoder is the main design

The paper's main design effort goes into the decoder. The encoder directly reuses parts of PointNet together with a simple graph-based idea.

Put simply, the codeword is replicated N times and concatenated with the 2D grid points to form an N×(D+K) matrix, from which the network learns to produce a point cloud that carries positional information.

This can be understood through a notion from geometry: high-dimensional data in nature can be represented as low-dimensional nonlinear manifolds. Here the 3D point cloud is treated as a 2D manifold, which is then modeled in two dimensions by a simple grid. The limitation is that this can only model 3D shapes that contain no loops (holes), not shapes that do. That is why FoldingNet++ was proposed later, to handle shapes with loops.

Graph-based encoder


Encoder = multilayer perceptron + graph-based max-pooling layers; the two are combined to form the encoder.

Graph construction: a 16-NN graph. For each point, compute the local covariance matrix of its neighborhood, which has size $3\times3$, and vectorize it to $1\times9$. The $n\times3$ input thus yields an $n\times9$ feature matrix, and concatenating the two gives $n\times12$.
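The feature construction above can be sketched in NumPy. This is only an illustrative sketch, not the authors' code: the function name `local_covariance_features` is hypothetical, each neighborhood is assumed to include the point itself, the covariance is normalized by k (biased estimate), and the brute-force neighbor search is only practical for small point sets.

```python
import numpy as np

def local_covariance_features(points, k=16):
    """Return an (n, 12) array: xyz coordinates plus flattened 3x3 local covariance."""
    n = points.shape[0]
    # Brute-force squared pairwise distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Indices of the k nearest points (assumption: the point itself is included).
    idx = np.argsort(dist2, axis=1)[:, :k]
    neigh = points[idx]                                   # (n, k, 3)
    centered = neigh - neigh.mean(axis=1, keepdims=True)  # center each neighborhood
    # Biased covariance per neighborhood, shape (n, 3, 3).
    cov = np.einsum("nki,nkj->nij", centered, centered) / k
    # Concatenate n x 3 positions with n x 9 vectorized covariances.
    return np.concatenate([points, cov.reshape(n, 9)], axis=1)
```

For an input of shape `(n, 3)` the result has shape `(n, 12)`, matching the dimensions stated above.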

Let A be the adjacency matrix of the KNN graph and X the input. The output matrix is then given by formula (2), where K is the feature mapping matrix, and each entry of the pooled input is given by formula (3). Formula (3) computes a local feature that captures the topology of the neighborhood.
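A minimal NumPy sketch of one such graph layer follows. It assumes formulas (2) and (3) have the form given in the FoldingNet paper, $Y = A_{\max}(X)K$ with $(A_{\max}(X))_{ij} = \mathrm{ReLU}(\max_{k\in\mathcal{N}(i)} x_{kj})$, where $\mathcal{N}(i)$ is the neighbor set of point $i$. The function name `graph_max_pool_layer` and the `neighbors` index array are hypothetical and introduced only for illustration.

```python
import numpy as np

def graph_max_pool_layer(X, neighbors, K):
    """One graph layer, assuming Y = A_max(X) K.

    X:         (n, d_in) per-point features
    neighbors: (n, k) integer indices of each point's neighbors
    K:         (d_in, d_out) feature mapping matrix
    """
    # Element-wise max over each point's neighborhood, shape (n, d_in).
    pooled = X[neighbors].max(axis=1)
    # ReLU, as in the assumed form of formula (3).
    pooled = np.maximum(pooled, 0.0)
    # Linear feature mapping, as in the assumed form of formula (2).
    return pooled @ K
```

Because the max is taken over graph neighbors rather than over all points, the output of each row depends only on the local neighborhood.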

Folding-based decoder


Decoder: two consecutive 3-layer perceptrons + a 2D grid.
Input: the codeword replicated m times (each copy is the 512-dimensional feature from the encoder) + the m points of the 2D grid, each 2-dimensional, giving an $m\times514$ matrix.
The grid here is a square with m=2025 points (45×45); the number of input points is n=2048.
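The construction of the decoder input can be sketched as follows. This is an illustrative sketch only: the function name `build_decoder_input` is hypothetical, the grid range $[-1, 1]$ is an assumption (the post does not state it), and the grid coordinates are placed before the codeword to match the $[u_i, \theta]$ ordering used in the theoretical analysis.

```python
import numpy as np

def build_decoder_input(codeword, side=45, lo=-1.0, hi=1.0):
    """Concatenate a square 2D grid with a replicated codeword.

    codeword: (d,) feature from the encoder, d = 512 in the post
    Returns an (m, 2 + d) matrix with m = side * side.
    """
    axis = np.linspace(lo, hi, side)          # assumed grid range [lo, hi]
    gx, gy = np.meshgrid(axis, axis)
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)   # (m, 2)
    tiled = np.tile(codeword, (grid.shape[0], 1))       # (m, d)
    return np.concatenate([grid, tiled], axis=1)        # (m, 2 + d)
```

With a 512-dimensional codeword and `side=45`, the result has shape `(2025, 514)`.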

So what is the folding operation?
It is the operation that the decoder performs: concatenating the replicated codewords with the 2D grid and passing the result through a point-wise multilayer perceptron is called a folding operation.

What is the role of the two 3-layer perceptrons?

The first folds the 2D grid into 3D space;
The second performs a further folding within 3D space, producing the final surface.
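The two folding steps can be sketched in NumPy with randomly initialized weights. This is a hedged sketch, not the released implementation: the helper names `init_mlp`, `mlp3` and `fold_twice`, the hidden width, and the ReLU activations are assumptions, and the second fold is assumed (following the paper) to take the intermediate 3D points concatenated with the same replicated codeword.

```python
import numpy as np

def init_mlp(rng, d_in, d_hidden, d_out):
    """Random weights for a 3-layer perceptron (hypothetical helper)."""
    dims = [d_in, d_hidden, d_hidden, d_out]
    return [(rng.normal(scale=0.05, size=(a, b)), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def mlp3(x, weights):
    """Apply a 3-layer perceptron to every row of x (point-wise MLP)."""
    (W1, b1), (W2, b2), (W3, b3) = weights
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU is an assumed activation
    h = np.maximum(h @ W2 + b2, 0.0)
    return h @ W3 + b3

def fold_twice(codeword, grid, fold1, fold2):
    """Two consecutive folding operations: 2D grid -> 3D -> 3D."""
    tiled = np.tile(codeword, (grid.shape[0], 1))
    # First fold: [grid point, codeword] -> intermediate 3D point.
    inter = mlp3(np.concatenate([grid, tiled], axis=1), fold1)
    # Second fold: [intermediate 3D point, codeword] -> final 3D point.
    return mlp3(np.concatenate([inter, tiled], axis=1), fold2)
```

For a 512-dimensional codeword, `fold1` takes 514-dimensional rows and `fold2` takes 515-dimensional rows; both output 3 coordinates per grid point.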

Theoretical analysis:
Let the 2D grid input be a matrix $U$ whose $i$-th row is $u_i$, and let the codeword output by the encoder be $\theta$.
After concatenation and the MLP, the output for each row can be written as $f([u_i, \theta])$. This can be viewed as a high-dimensional function parameterized by the codeword $\theta$. Because an MLP can approximate nonlinear functions, it can naturally realize the folding operation.