
[Point cloud series] Neural Opacity Point Cloud (NOPC)

2022-04-23 13:18:00 【^_^ Min Fei】

Contents

1. Summary
2. Motivation
3. Method
4. Experiments
- Data acquisition and processing
- Experimental results
5. Conclusion

1. Summary

Published in TPAMI, 2020
Project page: https://wuminye.github.io/NOPC/

Related topics: image-based rendering (IBR), neural rendering, matting
Key idea: use a point cloud to improve rendering quality
Work from Jingyi Yu's group

2. Motivation

Conventional image-based opacity hull (IBOH) techniques produce artifacts and ghosting when the sampling is insufficient. High-quality geometry can alleviate the problem, but for furry objects, obtaining accurate geometry remains a major challenge. Such objects contain thousands of hair fibers; because the fibers are very thin and occlude one another irregularly, they exhibit strongly view-dependent opacity. This opacity is hard to model in terms of either geometry or appearance, and even the latest 3D scanners cannot capture it fully.

The proposed rendering method combines image-based rendering (IBR) with neural rendering. It takes a coarse point cloud of the object as input and, using images captured from relatively sparse viewpoints, renders the photorealistic appearance and accurate opacity of furry objects from free viewpoints. The authors also propose a capture system for acquiring data of real furry objects. Even with a low-quality, incomplete 3D point cloud, the method can still generate photorealistic renderings.

3. Method

Pipeline overview:

From the point cloud $P$, the method learns the corresponding features $F$. To render a new viewpoint $V$, $P$ and $F$ are projected onto $V$ to build a view-dependent feature map $M$. The proposed multi-branch network maps $M$ to an RGB image and an alpha matte at $V$. The network is trained end to end against the ground-truth RGB images and alpha mattes.
The formulation uses the following notation:
Point cloud: $P=\{\mathbf{p}_i \in \mathbb{R}^3\}_{i=1}^{n_p}$
Features: $F=\{\mathbf{f}_i \in \mathbb{R}^m\}_{i=1}^{n_p}$, where $n_p$ is the number of points and $n$ is the number of images
$\hat{\mathbf{I}}_q$: the RGB image of the $q$-th viewpoint
$\hat{\mathbf{A}}_q$: the alpha matte of the $q$-th viewpoint
Camera parameters: viewpoint $V_q$, with intrinsics $\mathbf{K}_q$ and extrinsics $\mathbf{E}_q$
$\varPsi$: point projection
$R_{\theta}$: the neural renderer, which generates the RGB image and alpha matte at viewpoint $V_q$

Overall network architecture:

Concretely, NOPC consists of two modules, as shown in Figure 5:

The first module learns a feature for each 3D point; this feature encodes the local geometry and appearance around the point. Projecting all 3D points and their features to a virtual viewpoint yields the feature map for that viewpoint.
The second module uses a convolutional neural network to decode the RGB image and the opacity mask from the feature map. The network is based on U-Net and uses gated convolution in place of conventional convolution so that it can robustly handle coarse or incomplete 3D geometry. On top of U-Net's hierarchical structure, a new alpha prediction branch is added alongside the branch that predicts the RGB image; this branch effectively improves the performance of the whole model.
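To make the gated convolution mentioned above concrete, here is a minimal NumPy sketch. It assumes the common formulation in which a feature convolution is multiplied element-wise by a sigmoid gate computed by a second convolution. The function names, the ReLU activation, and the naive loop-based convolution are illustrative assumptions, not the NOPC implementation.

```python
import numpy as np

def conv2d_same(x, w, b):
    """Naive zero-padded 'same' convolution (cross-correlation, as in deep learning).
    x: (C_in, H, W), w: (C_out, C_in, k, k) with odd k, b: (C_out,)."""
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.empty((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            patch = xp[:, i:i + k, j:j + k]               # (C_in, k, k)
            out[:, i, j] = np.tensordot(w, patch, axes=3) + b
    return out

def gated_conv2d(x, w_feat, b_feat, w_gate, b_gate):
    """Gated convolution: activation(feature conv) * sigmoid(gate conv)."""
    feat = conv2d_same(x, w_feat, b_feat)
    gate = 1.0 / (1.0 + np.exp(-conv2d_same(x, w_gate, b_gate)))  # values in (0, 1)
    return np.maximum(feat, 0.0) * gate   # ReLU is an assumed choice of activation
```

The gate lets the network suppress responses per pixel and per channel, which is one plausible reason it helps with holes and noise in a projected point cloud.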

RGB encoder and decoder:
U-Net architecture + gated convolution (in place of ordinary convolution), which improves denoising and completion
Encoder: 1 convolution block + 4 downsampling blocks (each halves the size and doubles the channels)
Decoder: 4 upsampling blocks (back to the size of $\mathbf{M}_q$) + 1 convolution block (3 output channels)

Alpha encoder and decoder:
The alpha channel is very sensitive to low-level features such as image gradients and edges.
Encoder: 1 convolution block + 2 downsampling blocks (with only 2/3 of the channels of the RGB encoder)
Decoder: alpha encoder output + 2 upsampling blocks + 1 convolution block
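The block counts above can be turned into simple shape bookkeeping. The sketch below is only an illustration: `base_ch` (the channel count after the first convolution block) is a hypothetical parameter, and the assumptions that each upsampling block halves the channels and that the alpha downsampling blocks also double them are not stated in the text.

```python
def rgb_branch_shapes(h, w, in_ch, base_ch, n_down=4):
    """List (name, channels, height, width) through the RGB encoder and decoder."""
    shapes = [("input M_q", in_ch, h, w), ("enc conv", base_ch, h, w)]
    ch = base_ch
    for i in range(n_down):                # each block: halve size, double channels
        ch, h, w = ch * 2, h // 2, w // 2
        shapes.append((f"down {i + 1}", ch, h, w))
    for i in range(n_down):                # assumed mirror of the encoder
        ch, h, w = ch // 2, h * 2, w * 2
        shapes.append((f"up {i + 1}", ch, h, w))
    shapes.append(("out conv", 3, h, w))   # 3-channel RGB output
    return shapes

def alpha_encoder_channels(rgb_base_ch, n_down=2):
    """Channel counts of the alpha encoder, starting at 2/3 of the RGB width."""
    ch = (2 * rgb_base_ch) // 3
    channels = [ch]
    for _ in range(n_down):                # doubling here is an assumption
        ch *= 2
        channels.append(ch)
    return channels

if __name__ == "__main__":
    # Hypothetical sizes: m = 32 feature channels (+3), 256x256 map, base width 48.
    for row in rgb_branch_shapes(256, 256, in_ch=35, base_ch=48):
        print(row)
    print(alpha_encoder_channels(48))
```

With four downsampling steps, the height and width of $\mathbf{M}_q$ need to be divisible by 16 for the decoder to return exactly to the input size.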

Data preprocessing ：

Calibration: compute the extrinsic parameters of the $i$-th camera at the $f$-th frame; each camera at each frame corresponds to one viewpoint $V_q$.
Matting: removes the background

Parameters:
$\varepsilon$: 0.2
$j$: pixel position

View-dependent feature map:

Given the point cloud $P$ with its features $F$, a viewpoint $V_q$ with center of projection $\mathbf{c}_q$, and known camera parameters $\mathbf{K}_q$ and $\mathbf{E}_q$, each point $\mathbf{p}_i$ is projected into the image plane of $V_q$. Here $[x, y, z]$ are the three-dimensional coordinates of the point and $[u, v]$ are the coordinates of $\mathbf{p}_i$ after projection.
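As an illustration of this projection step, the sketch below uses the standard pinhole camera model. It assumes that $\mathbf{E}_q$ is a 3x4 world-to-camera matrix $[R \mid t]$ and that $\mathbf{K}_q$ is a 3x3 intrinsic matrix; the article does not spell out these conventions, so treat them as assumptions. `project_points` is a hypothetical helper name.

```python
import numpy as np

def project_points(P, K, E):
    """Project world points with a pinhole camera.
    P: (n_p, 3) points, K: (3, 3) intrinsics, E: (3, 4) world-to-camera [R | t].
    Returns pixel coordinates uv (n_p, 2) and camera-space depth z (n_p,).
    Points are assumed to lie in front of the camera (z > 0)."""
    P_h = np.hstack([P, np.ones((P.shape[0], 1))])   # homogeneous coordinates (n_p, 4)
    cam = P_h @ E.T                                   # [x, y, z] in the camera frame
    z = cam[:, 2]
    pix = cam @ K.T                                   # apply intrinsics
    uv = pix[:, :2] / z[:, None]                      # perspective division -> [u, v]
    return uv, z
```

For example, with focal length 100, principal point (50, 50) and an identity pose, the point (1, 0, 2) lands at pixel (100, 50) with depth 2.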

The view-dependent feature map $\mathbf{M}_q$ is then computed following Eq. (4), as in Eq. (5); it has $m + 3$ channels ($m$ feature channels plus the 3 components of the view direction),
where $\vec{d_i}=\frac{\mathbf{p}_i-\mathbf{c}_q}{||\mathbf{p}_i-\mathbf{c}_q||_2}$ is the normalized direction from the center of projection to the point, and $S_i = \{ (u,v) \mid \mathbf{p}_i \text{ is visible at } (u,v) \}$.
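One way to picture how such an $(m + 3)$-channel map could be assembled is a simple z-buffer rasterization, sketched below. The one-pixel splat, the nearest-point visibility test, and the zero fill for empty pixels are simplifying assumptions made for this example; `build_feature_map` and its arguments are hypothetical names, and `uv` and `z` are assumed to be projected pixel coordinates and depths.

```python
import numpy as np

def build_feature_map(uv, z, F, P, c, height, width):
    """Rasterize per-point features into an (H, W, m + 3) map.
    uv: (n_p, 2) pixel coords, z: (n_p,) depths, F: (n_p, m) features,
    P: (n_p, 3) points, c: (3,) center of projection.
    Returns the map and an index map (id of the visible point, -1 if none)."""
    m = F.shape[1]
    M = np.zeros((height, width, m + 3))
    index = -np.ones((height, width), dtype=int)
    depth = np.full((height, width), np.inf)
    d = P - c
    d = d / np.linalg.norm(d, axis=1, keepdims=True)   # normalized view directions d_i
    cols = np.rint(uv[:, 0]).astype(int)
    rows = np.rint(uv[:, 1]).astype(int)
    for i in range(P.shape[0]):
        r, col = rows[i], cols[i]
        if z[i] <= 0 or not (0 <= r < height and 0 <= col < width):
            continue                                    # behind camera or off-screen
        if z[i] < depth[r, col]:                        # keep the nearest point only
            depth[r, col] = z[i]
            index[r, col] = i
            M[r, col, :m] = F[i]                        # first m channels: feature
            M[r, col, m:] = d[i]                        # last 3 channels: direction
    return M, index
```

Because the last three channels change with the camera position, the same point contributes a different vector from each viewpoint, which is what lets the decoder model view-dependent opacity.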
Gradient of the loss: gradients are back-propagated to the features $\mathbf{f}_i$ and $\mathbf{f}_0$.

Here $\rho(\cdot)$ keeps only the first $m$ dimensions of a vector.
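A plausible reading of the role of $\rho(\cdot)$ is that the gradient reaching a point feature is the sum, over the pixels where the point is visible, of the first $m$ channels of the gradient of the feature map. The sketch below implements that reading; it is an interpretation, not the paper's formula. `index` is a hypothetical map holding, for each pixel, the id of the visible point or -1.

```python
import numpy as np

def rho(v, m):
    """Keep only the first m dimensions of the last axis."""
    return v[..., :m]

def feature_gradients(grad_M, index, n_points, m):
    """Scatter-add feature-map gradients back to the point features.
    grad_M: (H, W, m + 3) gradient of the loss w.r.t. the feature map.
    index:  (H, W) id of the visible point per pixel, -1 where empty.
    Returns grad_F of shape (n_points, m)."""
    grad_F = np.zeros((n_points, m))
    visible = index >= 0
    # np.add.at accumulates correctly when a point id appears at several pixels
    np.add.at(grad_F, index[visible], rho(grad_M[visible], m))
    return grad_F
```

Points that are not visible at any pixel receive a zero gradient under this scheme.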


Neural opacity rendering

Loss function ：
$\Omega{(\mathbf{A}_q, \mathbf{G})}$: the masking operation, which masks an image (here $\mathbf{A}_q$, and likewise the image $\mathbf{I}$) with $\mathbf{G}$, where $\mathbf{G}$ is the intersection of the alpha channel $\mathbf{A}$ with the point cloud depth map.
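The masking idea can be illustrated with a small sketch. It assumes that the mask is the set of pixels where both the alpha matte and the point cloud depth map are non-zero, and it uses a plain mean squared error as a stand-in for the actual loss, which is not reproduced in this article. All function names are hypothetical.

```python
import numpy as np

def valid_mask(alpha, depth):
    """G: pixels covered by both the alpha matte and the point cloud depth map.
    Treating 'non-zero' as 'covered' is an assumption of this sketch."""
    return (alpha > 0) & (depth > 0)

def omega(image, G):
    """Zero out everything outside the mask; works for (H, W) and (H, W, C)."""
    G = G.astype(image.dtype)
    return image * (G[..., None] if image.ndim == 3 else G)

def masked_mse(pred, target, G):
    """Stand-in loss: mean squared error restricted to the mask."""
    diff = omega(pred, G) - omega(target, G)
    return float(np.mean(diff ** 2))
```

Restricting the loss to the mask keeps background pixels from dominating the training signal.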

4. Experiments

NOPC has a wide range of applications. It can be used in content capture and rendering for virtual reality (VR) and augmented reality (AR), displaying objects that have opacity but are hard to model (such as human hair and plush toys) realistically in any virtual 3D scene. For example, one could take a real-time AR group photo with an idol, adjusting the idol's scale and position as needed while keeping the result realistic against any background.
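Placing a rendered object into a new scene comes down to compositing the predicted RGB image over a background using the predicted alpha matte. Below is a minimal sketch of the standard "over" operator, assuming non-premultiplied color and alpha values in [0, 1]; `composite` is a hypothetical helper, not part of the NOPC code.

```python
import numpy as np

def composite(rgb, alpha, background):
    """Alpha-composite a rendered object over a background.
    rgb, background: (H, W, 3) arrays; alpha: (H, W) array in [0, 1]."""
    a = alpha[..., None]                        # broadcast alpha over color channels
    return a * rgb + (1.0 - a) * background
```

Fractional alpha values along hair and fur boundaries are what let the background show through individual strands, so the accuracy of the predicted matte directly affects how natural the composite looks.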

Data acquisition and processing:

See the project page for details of the dataset.
Main datasets: hair and fur.

Experimental results:

[Figure: experimental results]

5. Conclusion

Rendering: good images + poor geometry
Reconstruction: learning-based features, matching, proxy estimation, optimization
Neural representation = neural modeling + rendering

In short, the paper tells us that a poor point cloud + good images = many more good images.

Copyright notice
This article was written by [^_^ Min Fei]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/04/202204230611136837.html