[Point cloud series] Multi-view Neural Human Rendering (NHR)
2022-04-23 13:18:00 【^_^ Min Fei】
1. Summary
Work from Jingyi Yu's group, CVPR 2020; part of the neural rendering series.
Paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Wu_Multi-View_Neural_Human_Rendering_CVPR_2020_paper.pdf
Project page: https://wuminye.github.io/NHR/
Datasets: https://wuminye.github.io/NHR/datasets.html
2. Motivation
NHR is an end-to-end framework designed specifically for rendering humans: PointNet++ extracts 3D features from the point cloud, the features are projected to 2D, and a CNN then deals with noise and incomplete geometry. In essence, it is a rendering method guided by a point cloud.
3. Method
Flow chart
Overall framework
The framework consists of three modules:
- Feature extraction (FE)
- Projection and rasterization (PR)
- Rendering (RE)
Module 1: Feature extraction (FE)
$\varPsi_{fe}$: the PointNet++ feature extraction operation. The classification branch is removed and only the segmentation branch is kept as the FE branch.
$D_t$: feature descriptors of the point cloud.
$V = \{v^i\}$: normalized view directions, $v^i = \frac{p^i_t - o}{\|p^i_t - o\|_2}$, where $o$ is the projection center of the target-view camera.
$\{\cdot\}$: concatenation. Here the point colors are concatenated with the normalized view directions, and the concatenated features serve as the initial point attributes for feature extraction.
$\varphi_{norm}$: normalization of the point coordinates.
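The view-direction formula and the concatenation step can be sketched with NumPy as follows. This is only an illustration under assumptions: the function name `initial_point_attributes`, the array layout, and the toy values are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def initial_point_attributes(points, colors, camera_center):
    """Concatenate per-point colors with normalized view directions.

    points:        (N, 3) point positions p_t^i
    colors:        (N, 3) per-point RGB values
    camera_center: (3,)   projection center o of the target camera
    """
    offsets = points - camera_center                       # p_t^i - o
    norms = np.linalg.norm(offsets, axis=1, keepdims=True)
    view_dirs = offsets / np.maximum(norms, 1e-12)         # v^i, unit length
    return np.concatenate([colors, view_dirs], axis=1)     # (N, 6)

# Toy input (hypothetical values).
points = np.array([[0.0, 0.0, 2.0], [3.0, 4.0, 0.0]])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
attrs = initial_point_attributes(points, colors, np.zeros(3))
print(attrs.shape)
```

Each row holds the color followed by the unit view direction, which is the kind of per-point attribute that would be handed to the feature extractor together with the normalized coordinates.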
Module 2: Projection and rasterization (PR)
$S$: the 2D feature map after projection, where $S_{x,y} = d^i_t$ and $d^i_t$ is the feature descriptor of the $i$-th point.
$E$: depth map of the current view.
Target camera parameters: $\hat{K}$, $\hat{T}$
Learnable parameters: $\theta_d$
$\psi_{pr}$: the whole projection and rasterization process.
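A minimal, non-differentiable sketch of the projection and rasterization step is given below. It assumes a pinhole camera with intrinsics `K` and a 4x4 world-to-camera matrix `T`, and it treats the value for empty pixels (`default_feature`, standing in for $\theta_d$) as a fixed array rather than a learned parameter. All names and conventions here are assumptions for illustration.

```python
import numpy as np

def project_and_rasterize(points, descriptors, K, T, height, width, default_feature):
    """Z-buffer rasterization of per-point descriptors.

    points:          (N, 3) world-space points
    descriptors:     (N, C) per-point features d_t^i
    K:               (3, 3) intrinsics
    T:               (4, 4) world-to-camera extrinsics (assumed convention)
    default_feature: (C,)   value for pixels that no point lands on
    Returns the feature map S (H, W, C) and the depth map E (H, W),
    with E set to inf where no point projects.
    """
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    cam = (T @ homogeneous.T).T[:, :3]                     # camera-space points
    S = np.tile(default_feature, (height, width, 1)).astype(float)
    E = np.full((height, width), np.inf)
    for point_cam, descriptor in zip(cam, descriptors):
        z = point_cam[2]
        if z <= 0:                                         # behind the camera
            continue
        u, v, _ = (K @ point_cam) / z                      # perspective divide
        x, y = int(round(u)), int(round(v))
        # Keep only the point closest to the camera at each pixel.
        if 0 <= x < width and 0 <= y < height and z < E[y, x]:
            E[y, x] = z
            S[y, x] = descriptor
    return S, E

# Toy scene (hypothetical): two points on the same ray, one beside them.
K = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]])
T = np.eye(4)
points = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 2.0], [1.0, 0.0, 1.0]])
descriptors = np.array([[1.0], [2.0], [3.0]])
S, E = project_and_rasterize(points, descriptors, K, T, 4, 4, np.zeros(1))
```

In the toy scene the first two points fall on the same pixel, and only the nearer one writes its descriptor and depth.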
Module 3: Rendering (RE)
$\psi_{render}$: a modified U-Net with 4 output channels. The first three channels are the RGB image $I^*$; the last channel is the mask $M^*$, passed through a sigmoid.
Training loss
L1 loss + perceptual loss
$n_b$: batch size.
$I^*_i$, $M^*_i$: the $i$-th rendered output image and mask.
$\psi_{vgg}$: extracts features from the 2nd and 4th layers of VGG-19, respectively.
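The loss formula itself is not reproduced in these notes, so the following is only a rough sketch of an "L1 + perceptual" loss averaged over a batch. The exact norms, the weighting, and whether the perceptual term is also applied to the mask are assumptions, and `feature_fns` is a hypothetical stand-in for the two VGG-19 layer outputs so that the example runs without pretrained weights.

```python
import numpy as np

def l1_plus_perceptual_loss(pred_rgb, pred_mask, gt_rgb, gt_mask, feature_fns):
    """Batch-averaged L1 + perceptual loss (illustrative form only).

    pred_rgb, gt_rgb:   (n_b, H, W, 3)
    pred_mask, gt_mask: (n_b, H, W, 1)
    feature_fns: callables standing in for the VGG-19 feature layers
    """
    n_b = pred_rgb.shape[0]
    total = 0.0
    for i in range(n_b):
        # L1 terms on the image and on the mask.
        total += np.abs(pred_rgb[i] - gt_rgb[i]).sum()
        total += np.abs(pred_mask[i] - gt_mask[i]).sum()
        # Perceptual terms: distance between feature maps (assumed L2).
        for f in feature_fns:
            total += np.linalg.norm((f(pred_rgb[i]) - f(gt_rgb[i])).ravel())
            total += np.linalg.norm((f(pred_mask[i]) - f(gt_mask[i])).ravel())
    return total / n_b

# Toy check with an identity "feature extractor" (hypothetical).
pred_rgb = np.zeros((1, 2, 2, 3))
gt_rgb = np.zeros((1, 2, 2, 3))
gt_rgb[0, 0, 0, 0] = 1.0
masks = np.ones((1, 2, 2, 1))
loss = l1_plus_perceptual_loss(pred_rgb, masks, gt_rgb, masks, [lambda x: x])
same = l1_plus_perceptual_loss(gt_rgb, masks, gt_rgb, masks, [lambda x: x])
```

With one differing pixel value and an identity feature function, the L1 term and the feature term each contribute 1, and identical inputs give zero loss.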
Geometry refinement
To refine the geometry, a dense set of new views is rendered, and the generated masks are used as silhouettes for reconstruction via space carving or shape-from-silhouette.
This is needed because the multi-view stereo input (in practice, a coarse point cloud) may contain holes or occluded regions.
Mask and shape generation: training the rendering model yields a matte analogous to the RGB image. Masks are then rendered at a uniformly sampled set of new viewpoints, each with its corresponding camera parameters, at a resolution of 800x600. Shape-from-silhouette (SfS) can then be used to reconstruct the human mesh.
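As a rough illustration of the shape-from-silhouette idea, the sketch below carves a set of voxel centers: a voxel is kept only if it projects inside the silhouette in every view. The function name, the 3x4 projection matrix convention $P = K[R \mid t]$, and the toy data are assumptions. Extracting a mesh from the surviving voxels is not shown.

```python
import numpy as np

def carve_voxels(voxel_centers, masks, projections):
    """Keep voxels whose projection falls inside every silhouette.

    voxel_centers: (N, 3) world-space voxel centers
    masks:         list of (H, W) boolean silhouettes
    projections:   list of (3, 4) matrices P = K [R | t], one per mask
    """
    homogeneous = np.hstack([voxel_centers, np.ones((len(voxel_centers), 1))])
    keep = np.ones(len(voxel_centers), dtype=bool)
    for mask, P in zip(masks, projections):
        height, width = mask.shape
        uvw = homogeneous @ P.T
        in_front = uvw[:, 2] > 0
        w = np.where(in_front, uvw[:, 2], 1.0)             # avoid dividing by <= 0
        x = np.round(uvw[:, 0] / w).astype(int)
        y = np.round(uvw[:, 1] / w).astype(int)
        inside = in_front & (x >= 0) & (x < width) & (y >= 0) & (y < height)
        hit = np.zeros(len(voxel_centers), dtype=bool)
        hit[inside] = mask[y[inside], x[inside]]
        keep &= hit                                        # must be inside in all views
    return keep

# Toy setup (hypothetical): one view, silhouette covering only the center pixel.
K = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]])
P = np.hstack([K, np.zeros((3, 1))])
mask = np.zeros((3, 3), dtype=bool)
mask[1, 1] = True
voxels = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
keep = carve_voxels(voxels, [mask], [P])
```

Only the first voxel survives here: the second projects outside the silhouette and the third lies behind the camera.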
Point sampling and coloring: the color of each sampled point can be taken from its nearest neighbor in the MVS point cloud.
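Nearest-neighbor coloring can be sketched with a brute-force search, as below. The function and variable names are hypothetical. For large point clouds a spatial index such as a k-d tree would be the usual choice instead of the full pairwise distance matrix used here.

```python
import numpy as np

def color_by_nearest_neighbor(sampled_points, mvs_points, mvs_colors):
    """Give each sampled point the color of its nearest MVS point.

    sampled_points: (M, 3), mvs_points: (N, 3), mvs_colors: (N, 3)
    """
    diffs = sampled_points[:, None, :] - mvs_points[None, :, :]   # (M, N, 3)
    sq_dists = np.einsum("mnc,mnc->mn", diffs, diffs)             # squared distances
    nearest = sq_dists.argmin(axis=1)                             # index into MVS cloud
    return mvs_colors[nearest]

# Toy data (hypothetical values).
mvs_points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
mvs_colors = np.array([[255, 0, 0], [0, 0, 255]])
sampled = np.array([[0.1, 0.0, 0.0], [0.9, 0.0, 0.0]])
sampled_colors = color_by_nearest_neighbor(sampled, mvs_points, mvs_colors)
```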
Hole patching:
Patching mechanism: for each point $u^i_t \in U_t$, the Euclidean distance $\phi(u^i_t, p^i_t)$ to the point $p^i_t$ in $P_t$ is larger than it is for the points in $\hat{P}_t - U_t$. A threshold $\tau_1$ is therefore set as in formula (5); the value used in the experiments is 0.2.
Then, for the points in $\hat{P}_t$, count the number of points $p^j_t$ whose Euclidean distance is below the threshold, denoted $s^i_t = \#\{b^i_t \mid b^i_t < \tau_1\}$.
Next, a histogram of all $s^i_t$ is computed with 15 bins, obtained by dividing the maximum distance value evenly into 15 bins. The first bin is observed to contain significantly more points than the second, so the maximum distance of the first bin is used as the second threshold $\tau_2$ to select the final points.
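The counting and histogram steps can be sketched as follows. This is one possible reading of the description, not the paper's implementation: $\tau_1$ is passed in as a given number because formula (5) is not reproduced here, $s^i_t$ is interpreted as the number of points of $P_t$ within $\tau_1$ of a point of $\hat{P}_t$, and the final selection rule is left out because its formula is missing from these notes. All function names are hypothetical.

```python
import numpy as np

def neighbor_counts(candidates, reference, tau_1):
    """s^i: number of reference points within tau_1 of each candidate point."""
    diffs = candidates[:, None, :] - reference[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)                  # (M, N) pairwise distances
    return (dists < tau_1).sum(axis=1)

def first_bin_threshold(values, num_bins=15):
    """Upper edge of the first of num_bins equal-width bins over [0, max]."""
    counts, edges = np.histogram(
        values, bins=num_bins, range=(0.0, float(np.max(values)))
    )
    return edges[1], counts

# Toy data (hypothetical values).
reference = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0]])
candidates = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
s = neighbor_counts(candidates, reference, tau_1=0.5)

values = np.array([0, 0, 0, 1, 15, 30])
tau_2, bin_counts = first_bin_threshold(values)
```

In the toy histogram the maximum is 30, so each of the 15 bins is 2 wide and the first bin's upper edge is the candidate for $\tau_2$.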
The figure below shows that patching the holes reduces flicker when the viewpoint changes.
Note: the final set may still contain artifacts, because its quality depends on the threshold $\tau$.
The figure gives a more intuitive picture of how the thresholds are set and how distances are measured.
4. Experiments
Datasets
Five sequences were captured with a multi-camera system of up to 80 cameras, at 25 frames per second. Each sequence lasts 8 to 24 seconds. The performers wear different clothes and perform different actions.
Each sequence includes RGB images, foreground masks, an RGB point cloud sequence, and camera calibration.
Experimental results:
The experiments show that the method takes full advantage of combining point clouds with images.
Comparison of results on the 5 datasets:
Visualization of the color part of the point cloud feature map:
5. Conclusion
In essence, the method combines a coarse point cloud, good images, and a few geometric tricks to obtain good results.
Copyright notice
This article was written by [^_^ Min Fei]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/04/202204230611136796.html