Image super-resolution (SR) is a fast-moving field with novel architectures attracting the spotlight

Related tags

Deep Learningrcan-it
Overview

Revisiting RCAN: Improved Training for Image Super-Resolution

Introduction

Image super-resolution (SR) is a fast-moving field with novel architectures attracting the spotlight. However, most SR models were optimized with dated training strategies. In this work, we revisit the popular RCAN model and examine the effect of different training options in SR. Surprisingly (or perhaps as expected), we show that RCAN can outperform or match nearly all the CNN-based SR architectures published after RCAN on standard benchmarks with a proper training strategy and minimal architecture change. Besides, although RCAN is a very large SR architecture with more than four hundred convolutional layers, we draw a notable conclusion that underfitting is still the main problem restricting the model capability instead of overfitting. We observe supportive evidence that increasing training iterations clearly improves the model performance while applying regularization techniques generally degrades the predictions. We denote our simply revised RCAN as RCAN-it and recommend practitioners to use it as baselines for future research. Please check our pre-print for more information.

Environment Setup

Create a new conda environment and install PyTorch:

conda create -n ptsr python=3.8 numpy
conda activate ptsr
conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=11.1 -c pytorch -c nvidia

Install the required packages:

git clone https://github.com/zudi-lin/rcan-it.git
cd rcan-it
pip install --editable .

Our package is called ptsr, abbreviating A PyTorch Framework for Image Super-Resolution. Then run tests to validate the installation:

python -m unittest discover -b tests

Multi-processing Distributed Data Parallel Training

For different hardware conditions, please first update the config files accordingly. Even for single-node single-GPU training, we use distributed data parallel (DDP) for consistency.

Single Node

Single GPU training:

CUDA_VISIBLE_DEVICES=0 python -u -m torch.distributed.run --nproc_per_node=1 \
--master_port=9988 main.py --distributed --config-base configs/RCAN/RCAN_Improved.yaml \
--config-file configs/RCAN/RCAN_x2.yaml

Single node with multiple (e.g., 4) GPUs:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -u -m torch.distributed.run --nproc_per_node=4 \
--master_port=9977 main.py --distributed --config-base configs/RCAN/RCAN_Improved.yaml \
--config-file configs/RCAN/RCAN_x2.yaml

By default the configuration file, model checkpoints and validation curve will be saved under outputs/, which is added to .gitignore and will be untracked by Git.

Multiple Nodes

After activating the virtual environment with PyTorch>=1.9.0, run hostname -I | awk '{print $1}' to get the ip address of the master node. Suppose the master ip address is 10.31.133.85, and we want to train the model on two nodes with multiple GPUs, then the commands are:

Node 0 (master node):

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 --nnodes=2 \ 
--node_rank=0 --master_addr="10.31.133.85" --master_port=9922 main.py --distributed \
--config-base configs/RCAN/RCAN_Improved.yaml --config-file configs/RCAN/RCAN_x2.yaml

Node 1:

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 --nnodes=2 \ 
--node_rank=1 --master_addr="10.31.133.85" --master_port=9922 main.py --distributed \
--config-base configs/RCAN/RCAN_Improved.yaml --config-file configs/RCAN/RCAN_x2.yaml

Description of the options:

  • --nproc_per_node: number of processes on each node. Set this to the number of GPUs on the node to maximize the training efficiency.
  • --nnodes: total number of nodes for training.
  • --node_rank: rank of the current node within all nodes.
  • --master_addr: the ip address of the master (rank 0) node.
  • --master_port: a free port to communicate with the master node.
  • --distributed: multi-processing Distributed Data Parallel (DDP) training.
  • --local_world_size: number of GPUs on the current node.

For a system with Slurm Workload Manager, please load required modules: module load cuda cudnn.

Data Parallel Training

Data Parallel training only works on single node with one or multiple GPUs. Different from the DDP scheme, it will create only one process. Single GPU training:

CUDA_VISIBLE_DEVICES=0 python main.py --config-base configs/RCAN/RCAN_Base.yaml \
--config-file configs/RCAN/RCAN_x2.yaml

Single node with multiple (e.g., 4) GPUs:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --config-base configs/RCAN/RCAN_Base.yaml \
--config-file configs/RCAN/RCAN_x2.yaml

Citation

Please check this pre-print for details. If you find this work useful for your research, please cite:

@misc{lin2022revisiting,
      title={Revisiting RCAN: Improved Training for Image Super-Resolution}, 
      author={Zudi Lin and Prateek Garg and Atmadeep Banerjee and Salma Abdel Magid and Deqing Sun and Yulun Zhang and Luc Van Gool and Donglai Wei and Hanspeter Pfister},
      year={2022},
      eprint={2201.11279},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
Zudi Lin
CS Ph.D. student at Harvard
Zudi Lin
Code for PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning

PackNet: https://arxiv.org/abs/1711.05769 Pretrained models are available here: https://uofi.box.com/s/zap2p03tnst9dfisad4u0sfupc0y1fxt Datasets in Py

Arun Mallya 216 Jan 05, 2023
Official PyTorch implementation of "Improving Face Recognition with Large AgeGaps by Learning to Distinguish Children" (BMVC 2021)

Inter-Prototype (BMVC 2021): Official Project Webpage This repository provides the official PyTorch implementation of the following paper: Improving F

Jungsoo Lee 16 Jun 30, 2022
Colab notebook and additional materials for Python-driven analysis of redlining data in Philadelphia

RedliningExploration The Google Colaboratory file contained in this repository contains work inspired by a project on educational inequality in the Ph

Benjamin Warren 1 Jan 20, 2022
A PyTorch implementation of ViTGAN based on paper ViTGAN: Training GANs with Vision Transformers.

ViTGAN: Training GANs with Vision Transformers A PyTorch implementation of ViTGAN based on paper ViTGAN: Training GANs with Vision Transformers. Refer

Hong-Jia Chen 127 Dec 23, 2022
Pytorch domain adaptation package

DomainAdaptation This package is created to tackle the problem of domain shifts when dealing with two domains of different feature distributions. In d

Institute of Computational Perception 7 Oct 22, 2022
The materials used in the SaxonJS tutorial presented at Declarative Amsterdam, 2021

SaxonJS-Tutorial-2021, version 1.0.4 Last updated on 4 November, 2021. Table of contents Background Prerequisites Starting a web server Running a Java

Saxonica 11 Oct 23, 2022
Facial expression detector

A tensorflow convolutional neural network model to detect facial expressions.

Carlos Tardón Rubio 5 Apr 20, 2022
PyTorch implementation for 3D human pose estimation

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach This repository is the PyTorch implementation for the network presented in:

Xingyi Zhou 579 Dec 22, 2022
Video Matting Refinement For Python

Video-matting refinement Library (use pip to install) scikit-image numpy av matplotlib Run Static background python path_to_video.mp4 Moving backgroun

3 Jan 11, 2022
Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt)

Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt) Task Training huge unsupervised deep neural networks yields to strong progress in

2 Aug 05, 2022
TabNet for fastai

TabNet for fastai This is an adaptation of TabNet (Attention-based network for tabular data) for fastai (=2.0) library. The original paper https://ar

Mikhail Grankin 116 Oct 21, 2022
DiAne is a smart fuzzer for IoT devices

Diane Diane is a fuzzer for IoT devices. Diane works by identifying fuzzing triggers in the IoT companion apps to produce valid yet under-constrained

seclab 28 Jan 04, 2023
Wav2Vec for speech recognition, classification, and audio classification

Soxan در زبان پارسی به نام سخن This repository consists of models, scripts, and notebooks that help you to use all the benefits of Wav2Vec 2.0 in your

Mehrdad Farahani 140 Dec 15, 2022
Unsupervised Image to Image Translation with Generative Adversarial Networks

Unsupervised Image to Image Translation with Generative Adversarial Networks Paper: Unsupervised Image to Image Translation with Generative Adversaria

Hao 71 Oct 30, 2022
A supplementary code for Editable Neural Networks, an ICLR 2020 submission.

Editable neural networks A supplementary code for Editable Neural Networks, an ICLR 2020 submission by Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry Py

Anton Sinitsin 32 Nov 29, 2022
1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection

This project releases our 1st place solution on ICDAR 2021 Competition on Mathematical Formula Detection. We implement our solution based on MMDetection, which is an open source object detection tool

yuxzho 94 Dec 25, 2022
Simple PyTorch implementations of Badnets on MNIST and CIFAR10.

Simple PyTorch implementations of Badnets on MNIST and CIFAR10.

Vera 75 Dec 13, 2022
My implementation of DeepMind's Perceiver

DeepMind Perceiver (in PyTorch) Disclaimer: This is not official and I'm not affiliated with DeepMind. My implementation of the Perceiver: General Per

Louis Arge 55 Dec 12, 2022
The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines.

The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines. It includes tools for downloading pipelines and their dependencies and tools for measuring their performace

8 Dec 04, 2022
This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction".

TreePartNet This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction". Depende

刘彦超 34 Nov 30, 2022