A general U-shaped transformer for image restoration
2022-04-23 06:00:00 【umbrellalalalala】
The paper was submitted to arXiv on June 6, 2021. The ICCV 2021 Eformer is an improvement built on Uformer, so this one seems worth reading; here is a quick note.
Also published synchronously on my Zhihu account of the same name.
1. Architecture design
The overall structure is shown in the architecture figure from the paper (not reproduced here).
Compared with a plain UNet, the difference is the LeWin Transformer block, which is also the main innovation of this work.
The so-called LeWin Transformer, i.e. the locally-enhanced window Transformer, consists of W-MSA and LeFF:
- W-MSA: non-overlapping window-based self-attention, whose purpose is to reduce computational overhead (a traditional Transformer computes self-attention globally; W-MSA does not);
- LeFF: a traditional Transformer uses a plain feed-forward network, which cannot exploit local context well; LeFF is adopted to capture local information.
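For orientation, the two modules are composed in the usual pre-norm residual Transformer layout (LN denotes LayerNorm; this is the standard form and, as far as I recall, matches the block equations in the paper):

$$X'_l = \text{W-MSA}(\text{LN}(X_{l-1})) + X_{l-1}, \qquad X_l = \text{LeFF}(\text{LN}(X'_l)) + X'_l$$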
Two innovations:
- Proposing the LeWin Transformer and introducing it into a UNet;
- Exploring three skip-connection variants.
2. Main module details
2.1 W-MSA
This is the biggest innovation of this work (though, as I was reminded, Swin Transformer already has it).
First, the C×H×W feature X is split into N non-overlapping patches of size C×M×M, where N = H×W / M². Each patch is treated as M×M C-dimensional vectors, and these M² vectors are fed into W-MSA. Put simply, X is divided into N non-overlapping windows, and self-attention is computed within each window.
The authors note that although self-attention is computed within a single window, in the encoder stage of the UNet the downsampling means that attention on such a window corresponds to attention over a larger receptive field of the feature map before downsampling.
Relative position encoding is adopted, so the attention can be written as the standard windowed attention with a relative position bias (as in Swin): $\text{Attention}(Q, K, V) = \text{SoftMax}\!\left(\frac{QK^T}{\sqrt{d_k}} + B\right)V$, where $B$ is the relative position bias.
The references cited for this position encoding, [48, 41], are:
[48] Peter Shaw, Jakob Uszkoreit, and Ashish Vaswani. Self-attention with relative position representations. arXiv preprint arXiv:1803.02155, 2018.
[41] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. arXiv preprint arXiv:2103.14030, 2021.
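A minimal PyTorch sketch of the window partition and in-window attention just described (my own illustration rather than the authors' code; the class name, head count, and the omitted relative-position-bias indexing are assumptions):

```python
import torch
import torch.nn as nn

class WindowAttention(nn.Module):
    """Self-attention computed independently inside each non-overlapping M x M window."""
    def __init__(self, dim, window_size, num_heads):
        super().__init__()
        self.M = window_size
        self.heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                              # x: (B, C, H, W)
        B, C, H, W = x.shape
        M = self.M
        # split into N = (H/M)*(W/M) windows, each flattened to M*M tokens of dim C
        x = x.view(B, C, H // M, M, W // M, M)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, M * M, C)
        qkv = self.qkv(x).reshape(-1, M * M, 3, self.heads, C // self.heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)           # each: (B*N, heads, M*M, C/heads)
        attn = (q @ k.transpose(-2, -1)) * self.scale  # attention only within the window
        # the learnable relative position bias B would be added to `attn` here (omitted)
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(-1, M * M, C)
        return self.proj(out)                          # still in window-token layout

# e.g. WindowAttention(dim=32, window_size=8, num_heads=4)(torch.randn(1, 32, 64, 64))
```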
2.2 LeFF
LeFF was proposed in Incorporating Convolution Designs into Visual Transformers, whose Convolution-enhanced image Transformer (CeiT) includes this design.
In essence, the N tokens (vectors) output by the self-attention module are rearranged into a $\sqrt{N} \times \sqrt{N}$ "image", and a depth-wise convolution is applied to it. After looking at the diagram given by the CeiT authors and then the one given by the Uformer authors, the idea is not hard to grasp.
A GELU activation is used after each linear/convolution layer.
(A quick search shows that depth-wise convolution serves to reduce the parameter count and speed up computation.)
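A minimal PyTorch sketch of the LeFF idea as described above (the class layout, hidden width, kernel size, and activation placement are my assumptions rather than the authors' exact settings):

```python
import math
import torch
import torch.nn as nn

class LeFF(nn.Module):
    """Locally-enhanced feed-forward: linear -> depth-wise conv on a 2D map -> linear."""
    def __init__(self, dim, hidden_dim=None, kernel_size=3):
        super().__init__()
        hidden_dim = hidden_dim or dim * 4
        self.linear1 = nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU())
        self.dwconv = nn.Sequential(
            nn.Conv2d(hidden_dim, hidden_dim, kernel_size,
                      padding=kernel_size // 2, groups=hidden_dim),  # depth-wise conv
            nn.GELU())
        self.linear2 = nn.Linear(hidden_dim, dim)

    def forward(self, x):                             # x: (B, N, C) tokens from attention
        B, N, C = x.shape
        S = int(math.sqrt(N))                         # assume N is a perfect square
        x = self.linear1(x)                           # (B, N, hidden)
        x = x.transpose(1, 2).reshape(B, -1, S, S)    # rearrange tokens to a 2D "image"
        x = self.dwconv(x)                            # capture local context
        x = x.flatten(2).transpose(1, 2)              # back to (B, N, hidden)
        return self.linear2(x)

# e.g. LeFF(dim=32)(torch.randn(1, 64, 32))
```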
2.3 Three skip-connection variants
The UNet architecture has skip connections; in this work they pass the outputs of the encoder-side Transformers to the decoder. There are multiple ways such skip connections could convey the information, and the authors explore three:
- The first directly concatenates the encoder feature over;
- The second: each decoder stage has one upsampling and two Transformer blocks, where the first block uses self-attention and the second uses cross-attention;
- The third uses the concatenated information as the key and value of a cross-attention.
The authors find the three perform similarly, with the first slightly better, so the first is used as Uformer's default setting; a rough sketch of it follows.
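For illustration, a rough sketch of one decoder stage using the default concat skip (the upsampling operator, channel bookkeeping, and the 1x1 fusion convolution are my assumptions, not necessarily how the paper wires it):

```python
import torch
import torch.nn as nn

class ConcatSkipStage(nn.Module):
    """One decoder stage: upsample, concat the encoder feature, then run the stage's blocks."""
    def __init__(self, dec_dim, enc_dim, blocks):
        super().__init__()
        # transposed conv doubles the spatial size and halves the channels (one common choice)
        self.upsample = nn.ConvTranspose2d(dec_dim, dec_dim // 2, kernel_size=2, stride=2)
        self.fuse = nn.Conv2d(dec_dim // 2 + enc_dim, enc_dim, kernel_size=1)  # merge channels
        self.blocks = blocks                       # e.g. a stack of LeWin blocks

    def forward(self, x, skip):                    # x: decoder feature, skip: encoder feature
        x = self.upsample(x)                       # match the encoder stage's resolution
        x = torch.cat([x, skip], dim=1)            # variant 1: plain channel-wise concat
        x = self.fuse(x)
        return self.blocks(x)

# e.g. ConcatSkipStage(128, 64, nn.Identity())(torch.randn(1, 128, 16, 16),
#                                              torch.randn(1, 64, 32, 32))
```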
Those are the details of Uformer's architecture design; I won't look closely at the remaining parts.
3. Computational cost
Since the W-MSA in the LeWin Transformer is all about reducing computational overhead, it is natural to look at the algorithm's complexity:
Given a feature map X of dimension C×H×W, traditional self-attention costs $O(H^2 W^2 C)$; splitting it into M×M patches and doing self-attention within each costs $O\!\left(\frac{HW}{M^2} \cdot M^4 \cdot C\right) = O(M^2 H W C)$, which reduces the complexity.
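Plugging in concrete numbers makes the saving tangible (the feature sizes below are just an example):

```python
# Rough operation counts for global vs. window self-attention on one feature map.
H, W, C, M = 128, 128, 32, 8               # example sizes, window size M = 8

global_attn = (H * W) ** 2 * C             # O(H^2 W^2 C): every token attends to all tokens
window_attn = (H * W // M**2) * M**4 * C   # O((HW/M^2) * M^4 * C) = O(M^2 * HW * C)

print(global_attn // window_attn)          # ratio = HW / M^2 = 256 for these sizes
```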
4. Experimental results
The authors run experiments on denoising, deraining, and deblurring.