Unoffical implementation about Image Super-Resolution via Iterative Refinement by Pytorch

Overview

Image Super-Resolution via Iterative Refinement

Paper | Project

Brief

This is a unoffical implementation about Image Super-Resolution via Iterative Refinement(SR3) by Pytorch.

There are some implement details with paper description, which maybe different with actual SR3 structure due to details missing.

  • We used the ResNet block and channel concatenation style like vanilla DDPM.
  • We used the attention mechanism in low resolution feature(16×16) like vanilla DDPM.
  • We encoding the $\gamma$ as FilM strcutrue did in WaveGrad, and embedding it without affine transformation.

Status

Conditional generation(super resolution)

  • 16×16 -> 128×128 on FFHQ-CelebaHQ
  • 64×64 -> 512×512 on FFHQ-CelebaHQ

Unconditional generation

  • 128×128 face generation on FFHQ
  • 1024×1024 face generation by a cascade of 3 models

Training Step

  • log / logger
  • metrics evaluation
  • multi-gpu support
  • resume training / pretrained model

Results

We set the maximum reverse steps budget to 2000 now.

Tasks/Metrics SSIM(+) PSNR(+) FID(-) IS(+)
16×16 -> 128×128 0.675 23.26 - -
64×64 -> 512×512 - -
128×128 - -
1024×1024 - -
show show show
show show show

Usage

Pretrained Model

This paper is based on "Denoising Diffusion Probabilistic Models", and we build both DDPM/SR3 network structure, which use timesteps/gama as model embedding input, respectively. In our experiments, SR3 model can achieve better visual results with same reverse steps and learning rate. You can select the json files with annotated suffix names to train different model.

Tasks Google Drive
16×16 -> 128×128 on FFHQ-CelebaHQ SR3
128×128 face generation on FFHQ SR3
# Download the pretrain model and edit [sr|sample]_[ddpm|sr3]_[resolution option].json about "resume_state":
"resume_state": [your pretrain model path]

We have not trained the model until converged for time reason, which means there are a lot room to optimization.

Data Prepare

New Start

If you didn't have the data, you can prepare it by following steps:

Download the dataset and prepare it in LMDB or PNG format using script.

# Resize to get 16×16 LR_IMGS and 128×128 HR_IMGS, then prepare 128×128 Fake SR_IMGS by bicubic interpolation
python prepare.py  --path [dataset root]  --out [output root] --size 16,128 -l

then you need to change the datasets config to your data path and image resolution:

"datasets": {
    "train": {
        "dataroot": "dataset/ffhq_16_128", // [output root] in prepare.py script
        "l_resolution": 16, // low resolution need to super_resolution
        "r_resolution": 128, // high resolution
        "datatype": "lmdb", //lmdb or img, path of img files
    },
    "val": {
        "dataroot": "dataset/celebahq_16_128", // [output root] in prepare.py script
    }
},

Own Data

You also can use your image data by following steps.

At first, you should organize images layout like this:

# set the high/low resolution images, bicubic interpolation images path
dataset/celebahq_16_128/
├── hr_128
├── lr_16
└── sr_16_128

then you need to change the dataset config to your data path and image resolution:

"datasets": {
    "train|val": {
        "dataroot": "dataset/celebahq_16_128",
        "l_resolution": 16, // low resolution need to super_resolution
        "r_resolution": 128, // high resolution
        "datatype": "img", //lmdb or img, path of img files
    }
},

Training/Resume Training

# Use sr.py and sample.py to train the super resolution task and unconditional generation task, respectively.
# Edit json files to adjust network structure and hyperparameters
python sr.py -p train -c config/sr_sr3.json

Test/Evaluation

# Edit json to add pretrain model path and run the evaluation 
python sr.py -p val -c config/sr_sr3.json

Evaluation Alone

# Quantitative evaluation using SSIM/PSNR metrics on given dataset root
python eval.py -p [dataset root]

Acknowledge

Our work is based on the following theoretical works:

and we are benefit a lot from following projects:

Owner
LiangWei Jiang
LiangWei Jiang
A coin flip game in which you can put the amount of money below or equal to 1000 and then choose heads or tail

COIN_FLIPPY ##This is a simple example package. You can use Github-flavored Markdown to write your content. Coinflippy A coin flip game in which you c

2 Dec 26, 2021
Code for "ATISS: Autoregressive Transformers for Indoor Scene Synthesis", NeurIPS 2021

ATISS: Autoregressive Transformers for Indoor Scene Synthesis This repository contains the code that accompanies our paper ATISS: Autoregressive Trans

138 Dec 22, 2022
Code and models for "Rethinking Deep Image Prior for Denoising" (ICCV 2021)

DIP-denosing This is a code repo for Rethinking Deep Image Prior for Denoising (ICCV 2021). Addressing the relationship between Deep image prior and e

Computer Vision Lab. @ GIST 36 Dec 29, 2022
BlockUnexpectedPackets - Preventing BungeeCord CPU overload due to Layer 7 DDoS attacks by scanning BungeeCord's logs

BlockUnexpectedPackets This script automatically blocks DDoS attacks that are sp

SparklyPower 3 Mar 31, 2022
Official PyTorch code for CVPR 2020 paper "Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision"

Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision https://arxiv.org/abs/2003.00393 Abstract Active learning (AL) aims to min

Denis 29 Nov 21, 2022
Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022

PGNet Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022, CVPR 2022 (arXiv 2204.05041) Abstract Recent salient objec

CVTEAM 109 Dec 05, 2022
QueryDet: Cascaded Sparse Query for Accelerating High-Resolution SmallObject Detection

QueryDet-PyTorch This repository is the official implementation of our paper: QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small O

Chenhongyi Yang 276 Dec 31, 2022
You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks.

AllSet This is the repo for our paper: You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks. We prepared all codes and a subse

Jianhao 51 Dec 24, 2022
Official code for 'Robust Siamese Object Tracking for Unmanned Aerial Manipulator' and offical introduction to UAMT100 benchmark

SiamSA: Robust Siamese Object Tracking for Unmanned Aerial Manipulator Demo video 📹 Our video on Youtube and bilibili demonstrates the evaluation of

Intelligent Vision for Robotics in Complex Environment 12 Dec 18, 2022
This is the source code of the 1st place solution for segmentation task (with Dice 90.32%) in 2021 CCF BDCI challenge.

1st place solution in CCF BDCI 2021 ULSEG challenge This is the source code of the 1st place solution for ultrasound image angioma segmentation task (

Chenxu Peng 30 Nov 22, 2022
GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification

GalaXC GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification @InProceedings{Saini21, author = {Saini, D. and Jain,

Extreme Classification 28 Dec 05, 2022
This code is a near-infrared spectrum modeling method based on PCA and pls

Nirs-Pls-Corn This code is a near-infrared spectrum modeling method based on PCA and pls 近红外光谱分析技术属于交叉领域,需要化学、计算机科学、生物科学等多领域的合作。为此,在(北邮邮电大学杨辉华老师团队)指导下

Fu Pengyou 6 Dec 17, 2022
Graph Representation Learning via Graphical Mutual Information Maximization

GMI (Graphical Mutual Information) Graph Representation Learning via Graphical Mutual Information Maximization (Peng Z, Huang W, Luo M, et al., WWW 20

93 Dec 29, 2022
For auto aligning, cropping, and scaling HR and LR images for training image based neural networks

ImgAlign For auto aligning, cropping, and scaling HR and LR images for training image based neural networks Usage Make sure OpenCV is installed, 'pip

15 Dec 04, 2022
Segcache: a memory-efficient and scalable in-memory key-value cache for small objects

Segcache: a memory-efficient and scalable in-memory key-value cache for small objects This repo contains the code of Segcache described in the followi

TheSys Group @ CMU CS 78 Jan 07, 2023
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim, Jungil Kong, and Juhee Son In our rece

Jaehyeon Kim 1.7k Jan 08, 2023
Algorithm to texture 3D reconstructions from multi-view stereo images

MVS-Texturing Welcome to our project that textures 3D reconstructions from images. This project focuses on 3D reconstructions generated using structur

Nils Moehrle 766 Jan 04, 2023
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data

FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well i

0 Sep 06, 2022
ByteTrack with ReID module following the paradigm of FairMOT, tracking strategy is borrowed from FairMOT/JDE.

ByteTrack_ReID ByteTrack is the SOTA tracker in MOT benchmarks with strong detector YOLOX and a simple association strategy only based on motion infor

Han GuangXin 46 Dec 29, 2022
The full training script for Enformer (Tensorflow Sonnet) on TPU clusters

Enformer TPU training script (wip) The full training script for Enformer (Tensorflow Sonnet) on TPU clusters, in an effort to migrate the model to pyt

Phil Wang 10 Oct 19, 2022