StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis
2022-04-21 12:50:00 【_ Summer tree】

Abstract
StyleNeRF:
- A 3D-aware generative model with multi-view consistency.
- Trained on unstructured 2D images.
- Combines NeRF with a style-based generator to improve rendering efficiency and 3D consistency for high-resolution image generation (the goal).
- Uses volume rendering only to produce a low-resolution feature map, then progressively upsamples it in 2D to address rendering efficiency.
- Ways to mitigate the resulting inconsistencies:
- A better upsampler
- A new regularization loss
- ……
- Result: StyleNeRF can synthesize high-resolution images quickly while preserving 3D consistency.
- Camera pose and different levels of style can be controlled, which makes it possible to generate unseen views.
- It also supports challenging tasks, including zoom-in and zoom-out, style mixing, inversion, and semantic editing.
Problems with existing methods:
- They cannot synthesize high-resolution images.
- They produce noticeable 3D-inconsistent artifacts.
- They lack control over style attributes and explicit camera pose.
Method

- Comparison of our upsampling approach with other methods: ours maintains good 3D consistency.

3.1 IMAGE SYNTHESIS AS NEURAL IMPLICIT FIELD RENDERING
Style-based generative NeRF
- To model high-frequency details, we map each dimension of x and d to Fourier features.

- The StyleNeRF representation is formalized by modulating the NeRF MLP with a style vector w:
- f is a mapping network that maps noise vectors from a spherical Gaussian space to the style space W.
- $g_w^i(\cdot)$ is the i-th MLP layer, modulated by the input style vector $w$.
- $\phi_w^n(x)$ is the n-th layer feature of point x.
- The extracted features are used to predict density and color.

- Here $h_\sigma$ and $h_c$, which predict density and color, can each be a linear projection or a 2-layer MLP.
- The first $\min(n_\sigma, n_c)$ layers are shared in the network.
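The Fourier feature mapping mentioned above can be sketched as follows. This is a minimal sketch, assuming the standard NeRF-style positional encoding with frequencies that double at each band; the function name `fourier_features` and the parameter `num_freqs` are hypothetical, and some implementations additionally scale the input by π.

```python
import math


def fourier_features(values, num_freqs):
    """Map each input dimension to sin/cos features at doubling frequencies.

    Assumed layout (standard NeRF positional encoding): for every value v
    and every k in [0, num_freqs), emit sin(2^k * v) and cos(2^k * v).
    """
    features = []
    for v in values:
        for k in range(num_freqs):
            angle = (2.0 ** k) * v
            features.append(math.sin(angle))
            features.append(math.cos(angle))
    return features


# A 3D point with 4 frequency bands becomes a 3 * 4 * 2 = 24-dim vector.
encoded = fourier_features([0.1, -0.4, 0.7], num_freqs=4)
print(len(encoded))
```

The same function would be applied to the view direction d, usually with fewer frequency bands than for the position x.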
Volume Rendering
- We assume the camera lies on the unit sphere and points at the origin with a fixed field of view (FOV).
- Depending on the dataset, the camera pitch and yaw are sampled from a uniform or a Gaussian distribution.
- The image I is rendered with the standard NeRF volume rendering formula.
- As in NeRF, stratified and hierarchical sampling are used.
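The camera model and the rendering step above can be sketched for a single ray. This is a rough illustration under stated assumptions: the axis convention in `camera_position`, the angle spreads, and the near/far bounds are all hypothetical, the compositing rule is the usual NeRF quadrature, and hierarchical (coarse-to-fine) sampling is left out for brevity.

```python
import math
import random


def sample_angle(rng, mean, spread, gaussian):
    """Draw pitch or yaw from a Gaussian or a uniform distribution."""
    if gaussian:
        return rng.gauss(mean, spread)
    return rng.uniform(mean - spread, mean + spread)


def camera_position(pitch, yaw):
    """Camera location on the unit sphere (axis convention is assumed)."""
    return (
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )


def stratified_depths(near, far, n_bins, rng):
    """Stratified sampling: one random depth inside each equal-width bin."""
    width = (far - near) / n_bins
    return [near + (i + rng.random()) * width for i in range(n_bins)]


def composite(sigmas, colors, depths, far):
    """NeRF quadrature: pixel = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i."""
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for i, (sigma, color) in enumerate(zip(sigmas, colors)):
        next_depth = depths[i + 1] if i + 1 < len(depths) else far
        delta = next_depth - depths[i]
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        weights.append(weight)
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel, weights


rng = random.Random(0)
pitch = sample_angle(rng, 0.0, 0.2, gaussian=True)
yaw = sample_angle(rng, 0.0, 0.5, gaussian=False)
position = camera_position(pitch, yaw)
view_direction = tuple(-c for c in position)  # the camera looks at the origin
depths = stratified_depths(0.5, 1.5, 8, rng)
```

Because the camera sits on the unit sphere and looks at the origin, the viewing direction is simply the negated position.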
Challenges
- These models cost much more computation to render an image at the exact resolution.
- They consume much more memory to cache the intermediate results for gradient back-propagation during training.
3.2 APPROXIMATION FOR HIGH-RESOLUTION IMAGE GENERATION
Why 2D image generation is fast
- Each pixel needs only a single forward pass through the network;
- Image features are generated from coarse to fine, and higher-resolution feature maps have fewer channels to save memory.
The first point is partially achieved by aggregating features into 2D space early, before the final colors are computed; Equation (4) is adjusted accordingly.



We use upsampling to approximate the high-resolution feature map from the low-resolution one.

- Recursively inserting upsampling operators enables efficient high-resolution image synthesis, because the computationally expensive volume rendering only needs to produce a low-resolution feature map.
- Efficiency improves further when fewer channels are used at higher resolutions.
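A quick back-of-the-envelope count shows why rendering at low resolution pays off. The 32×32 input resolution comes from the experiment settings in these notes; the 1024×1024 target and the 64 samples per ray are illustrative assumptions, and the function names are hypothetical.

```python
def point_queries(resolution, samples_per_ray):
    """Number of NeRF MLP evaluations needed to render one square image."""
    return resolution * resolution * samples_per_ray


def upsampling_stages(low_res, target_res):
    """Number of 2x upsampling operators needed to reach the target."""
    stages = 0
    resolution = low_res
    while resolution < target_res:
        resolution *= 2
        stages += 1
    return stages


full = point_queries(1024, samples_per_ray=64)   # volume render every pixel
approx = point_queries(32, samples_per_ray=64)   # volume render a 32x32 map
print(full, approx, full // approx)
print(upsampling_stages(32, 1024))
```

Under these assumptions the volume rendering workload shrinks by a factor of 1024, at the price of five 2D upsampling stages.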
Although early aggregation and upsampling accelerate rendering for high-resolution image synthesis, they break the inherent consistency of NeRF.
How do the inconsistencies arise?
- First, the resulting model contains non-linear transformations that capture spurious correlations in 2D observations, mainly when substantial ambiguity exists.
- Second, such a pixel-space operation like up-sampling would compromise 3D consistency.
3.3 PRESERVING 3D CONSISTENCY
Upsampler design
We balance consistency against image quality by combining two approaches, a learnable upsampling path and a fixed blur filter (see Figure 2).
The upsampler takes any input feature map $X \in \mathbb{R}^{N \times N \times D}$ and uses:

- $\psi_\theta: \mathbb{R}^D \rightarrow \mathbb{R}^{4D}$ is a learnable two-layer MLP.
- K is a fixed blur kernel.
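One way to read the two ingredients above is sketched below: each input feature is repeated four times, the output of the learnable map is added as a residual, the result is rearranged into a 2N×2N map (pixel shuffle), and the fixed kernel K blurs it. This is only a sketch under assumptions: `psi` is passed in as a plain callable rather than a trained MLP, the channel layout of its 4D output, the 3×3 binomial kernel, and the clamp-to-edge padding are all illustrative choices, not details taken from the notes.

```python
def upsample(feature_map, psi, kernel=((1, 2, 1), (2, 4, 2), (1, 2, 1))):
    """Upsample an N x N x D nested-list feature map to 2N x 2N x D."""
    n = len(feature_map)
    d = len(feature_map[0][0])
    size = 2 * n

    # Step 1: repeat each feature 4 times, add psi's residual, and place the
    # four D-dim results in a 2x2 neighbourhood (pixel shuffle).
    shuffled = [[None] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            x = feature_map[i][j]
            residual = psi(x)  # assumed to return 4 * d values
            for k in range(4):
                di, dj = divmod(k, 2)
                shuffled[2 * i + di][2 * j + dj] = [
                    x[c] + residual[k * d + c] for c in range(d)
                ]

    # Step 2: blur with the fixed kernel K, clamping indices at the border.
    norm = sum(sum(row) for row in kernel)
    out = [[[0.0] * d for _ in range(size)] for _ in range(size)]
    for i in range(size):
        for j in range(size):
            for a in range(3):
                for b in range(3):
                    ii = min(max(i + a - 1, 0), size - 1)
                    jj = min(max(j + b - 1, 0), size - 1)
                    weight = kernel[a][b] / norm
                    for c in range(d):
                        out[i][j][c] += weight * shuffled[ii][jj][c]
    return out


def zero_psi(x):
    """Stand-in for the learnable MLP: contributes no residual."""
    return [0.0] * (4 * len(x))


low = [[[1.0, 2.0] for _ in range(2)] for _ in range(2)]
high = upsample(low, zero_psi)
print(len(high), len(high[0]), len(high[0][0]))
```

With a zero residual the operator reduces to nearest-neighbour repetition followed by a low-pass filter; the learnable part only has to model the detail on top of that.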
NeRF path regularization
Regularize the model output to match the original NeRF path (Equation (4)).
This is done by subsampling pixels of the output and comparing them with the corresponding pixels generated by NeRF, where:

- S is a set of randomly sampled pixels.
- $R_{in}$ and $R_{out}$ are the corresponding rays of the pixels in the low-resolution image generated by NeRF and in the high-resolution image generated by StyleNeRF.
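The regularizer described above can be sketched as a mean squared color difference over the sampled pixel set S. The squared-error form and the 1/|S| normalization are assumptions here, and `nerf_pixel` and `approx_pixel` are hypothetical callables standing in for the NeRF path and the upsampled StyleNeRF output.

```python
import random


def sample_pixels(height, width, count, rng):
    """Draw `count` distinct pixel coordinates (the set S)."""
    flat_indices = rng.sample(range(height * width), count)
    return [divmod(index, width) for index in flat_indices]


def nerf_path_loss(nerf_pixel, approx_pixel, pixels):
    """Mean squared color difference between the two paths over S."""
    total = 0.0
    for i, j in pixels:
        reference = nerf_pixel(i, j)   # color rendered by the NeRF path
        output = approx_pixel(i, j)    # color from the upsampled output
        total += sum((r - o) ** 2 for r, o in zip(reference, output))
    return total / len(pixels)


def toy_nerf(i, j):
    return (i / 8.0, j / 8.0, 0.5)


def toy_output(i, j):
    r, g, b = toy_nerf(i, j)
    return (r, g, b + 0.5)  # off by 0.5 in the blue channel everywhere


pixels = sample_pixels(8, 8, 16, random.Random(0))
print(nerf_path_loss(toy_nerf, toy_output, pixels))
```

Only the sampled pixels are rendered through the expensive NeRF path, which keeps the regularizer cheap relative to full-resolution volume rendering.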
Remove view direction condition
Predicting colors with view direction condition would give the model additional freedom to capture spurious correlations and dataset bias, especially if only a single-view target is provided. We therefore remove the view direction condition to improve consistency, as shown in Figure 8.

Fix 2D noise injection
Studies have shown that injecting per-pixel noise improves a model's ability to capture stochastic variation (e.g. hair, stubble). Such noise lives in 2D image space, however, so it does not change consistently when the camera moves.
Our default solution is to remove noise injection, trading away some of the model's capacity to capture such variation.
We also propose a novel geometry-aware noise injection method based on the surface estimated by StyleNeRF (see Appendix A.3).
3.4 StyleNeRF framework
Mapping Network
Latent codes are sampled from a standard Gaussian distribution and processed by the mapping network. The output vectors are then broadcast to the synthesis network.

Synthesis Network
We use NeRF++ as the backbone of StyleNeRF.
NeRF++ consists of a foreground NeRF inside the unit sphere and a background NeRF parameterized with an inverted sphere.
Two MLPs predict density; the background (BG) MLP has fewer parameters than the foreground (FG) one.
A shared MLP then predicts color.
Each style-conditioned block consists of an affine transformation layer and a 1×1 convolution layer (Conv).
The Conv weights are modulated with the affine-transformed style.
Leaky ReLU is used as the non-linear activation.
The number of blocks depends on the resolutions of the input and the target image.
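A style-conditioned block of this kind can be sketched per pixel, since a 1×1 convolution acts as a linear layer on each feature vector. This is a minimal sketch in the spirit of StyleGAN2-style weight modulation; the demodulation step, the Leaky ReLU slope of 0.2, and all function names are assumptions rather than details given in these notes.

```python
import math


def affine(style, weight, bias):
    """Affine layer: turn the style vector into one scale per input channel."""
    return [
        sum(w * s for w, s in zip(row, style)) + b
        for row, b in zip(weight, bias)
    ]


def modulated_conv1x1(feature, conv_weight, scales, demodulate=True, eps=1e-8):
    """Apply a 1x1 conv whose weights are scaled per input channel."""
    out = []
    for row in conv_weight:  # one row of weights per output channel
        modulated = [w * s for w, s in zip(row, scales)]
        if demodulate:
            # Assumed StyleGAN2-style demodulation: renormalize each row.
            norm = math.sqrt(sum(m * m for m in modulated) + eps)
            modulated = [m / norm for m in modulated]
        out.append(sum(m * f for m, f in zip(modulated, feature)))
    return out


def leaky_relu(values, slope=0.2):
    """Leaky ReLU; the slope value is an assumption."""
    return [v if v >= 0 else slope * v for v in values]


def style_block(feature, style, affine_w, affine_b, conv_weight):
    """Affine transformation, modulated 1x1 Conv, then Leaky ReLU."""
    scales = affine(style, affine_w, affine_b)
    return leaky_relu(modulated_conv1x1(feature, conv_weight, scales))


feature = [1.0, -2.0]
style = [0.5, 0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(style_block(feature, style, identity, [1.0, 1.0], identity))
```

Changing `style` rescales the convolution weights without touching the feature map directly, which is what lets one latent code steer every layer of the generator.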
Discriminator & Objectives
StyleNeRF uses a non-saturating GAN objective with R1 regularization.
The new NeRF path regularization is applied to enforce 3D consistency.
The final loss function combines the two terms above, where:

- G is the generator, consisting of the mapping and synthesis networks.
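The pieces of the objective can be sketched on scalar discriminator outputs. This uses the common textbook form of the non-saturating GAN loss and of the R1 gradient penalty; the weights `r1_weight` and `path_weight` are hypothetical parameters (no values are given in these notes), and the gradient norm is passed in as a number instead of being computed by automatic differentiation.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def generator_loss(fake_logit):
    """Non-saturating generator loss: -log(sigmoid(D(G(z))))."""
    return softplus(-fake_logit)


def discriminator_loss(real_logit, fake_logit, real_grad_sq_norm, r1_weight):
    """Adversarial loss plus R1 penalty on the gradient at real images."""
    adversarial = softplus(fake_logit) + softplus(-real_logit)
    r1_penalty = 0.5 * r1_weight * real_grad_sq_norm
    return adversarial + r1_penalty


def total_generator_loss(fake_logit, nerf_path_term, path_weight):
    """Generator loss with the NeRF path regularizer added."""
    return generator_loss(fake_logit) + path_weight * nerf_path_term


print(generator_loss(0.0))
print(discriminator_loss(0.0, 0.0, real_grad_sq_norm=2.0, r1_weight=10.0))
print(total_generator_loss(0.0, nerf_path_term=0.5, path_weight=0.2))
```

The non-saturating form keeps the generator gradient large when the discriminator confidently rejects a sample, which is why it is preferred over the original minimax loss.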
Progressive training
Training starts at low resolution and moves to high resolution.
We propose a new three-stage progressive training strategy:
- For the first T1 images, train at low resolution without approximation.
- Between T1 and T2 images, the generator and the discriminator increase their output resolution until the target resolution is reached.
- Finally, the architecture is fixed and training continues at the highest resolution until T3 images.
- See Appendix A.4 for details.
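The three stages above can be sketched as a schedule that maps the number of images seen to an output resolution. The values of T1 and T2 and the 32×32 starting resolution are taken from the experiment settings in these notes; the 1024 target and the rule of doubling the resolution at evenly spaced points between T1 and T2 are assumptions made for illustration, and `output_resolution` is a hypothetical name.

```python
def output_resolution(images_seen, t1, t2, low_res, target_res):
    """Output resolution of generator and discriminator at a training step."""
    if images_seen < t1:
        return low_res       # stage 1: low resolution, no approximation
    if images_seen >= t2:
        return target_res    # stage 3: fixed architecture at full resolution

    # Stage 2: count how many 2x steps separate low_res from target_res ...
    doublings = 0
    resolution = low_res
    while resolution < target_res:
        resolution *= 2
        doublings += 1

    # ... and spread them evenly over the interval [t1, t2).
    progress = (images_seen - t1) / (t2 - t1)
    return low_res * 2 ** int(progress * doublings)


T1, T2, T3 = 500_000, 5_000_000, 25_000_000
for seen in (0, T1, 2_750_000, T2, T3):
    print(seen, output_resolution(seen, T1, T2, low_res=32, target_res=1024))
```

Under this rule the model spends the interval from T1 to T2 stepping through intermediate resolutions and reaches the target exactly at T2.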
Experiments
StyleNeRF is evaluated on FFHQ, MetFaces, AFHQ, and CompCars.
Baselines:
- HoloGAN
- GRAF
- pi-GAN
- GIRAFFE
batch_size = 64, T1 = 500k, T2 = 5000k, T3 = 25000k.
The input resolution is fixed to 32×32.
Results


High resolution synthesis

Controllable image synthesis
Camera control (the results here are not very good):

Style mixing and interpolation:

Important references
Michael Niemeyer and Andreas Geiger. GIRAFFE: Representing scenes as compositional generative neural feature fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11453–11464, 2021.
Kai Zhang, Gernot Riegler, Noah Snavely, and Vladlen Koltun. NeRF++: Analyzing and improving neural radiance fields. arXiv preprint arXiv:2010.07492, 2020.
problem
Copyright notice
This article was written by [_ Summer tree]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204211248375333.html