Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Overview

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

This repository is the official PyTorch implementation of Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (arxiv, supplementary).

🚀 🚀 🚀 News:


Existing blind image super-resolution (SR) methods mostly assume blur kernels are spatially invariant across the whole image. However, such an assumption is rarely applicable for real images whose blur kernels are usually spatially variant due to factors such as object motion and out-of-focus. Hence, existing blind SR methods would inevitably give rise to poor performance in real applications. To address this issue, this paper proposes a mutual affine network (MANet) for spatially variant kernel estimation. Specifically, MANet has two distinctive features. First, it has a moderate receptive field so as to keep the locality of degradation. Second, it involves a new mutual affine convolution (MAConv) layer that enhances feature expressiveness without increasing receptive field, model size and computation burden. This is made possible through exploiting channel interdependence, which applies each channel split with an affine transformation module whose input are the rest channel splits. Extensive experiments on synthetic and real images show that the proposed MANet not only performs favorably for both spatially variant and invariant kernel estimation, but also leads to state-of-the-art blind SR performance when combined with non-blind SR methods.

Requirements

  • Python 3.7, PyTorch >= 1.6, scipy >= 1.6.3
  • Requirements: opencv-python
  • Platforms: Ubuntu 16.04, cuda-10.0 & cuDNN v-7.5

Note: this repository is based on BasicSR. Please refer to their repository for a better understanding of the code framework.

Quick Run

Download stage3_MANet+RRDB_x4.pth from release and put it in ./pretrained_models. Then, run this command:

cd codes
python test.py --opt options/test/test_stage3.yml

Data Preparation

To prepare data, put training and testing sets in ./datasets as ./datasets/DIV2K/HR/0801.png. Commonly used datasets can be downloaded here.

Training

Step1: to train MANet, run this command:

python train.py --opt options/train/train_stage1.yml

Step2: to train non-blind RRDB, run this command:

python train.py --opt options/train/train_stage2.yml

Step3: to fine-tune RRDB with MANet, run this command:

python train.py --opt options/train/train_stage3.yml

All trained models can be downloaded from release. For testing, downloading stage3 models is enough.

Testing

To test MANet (stage1, kernel estimation only), run this command:

python test.py --opt options/test/test_stage1.yml

To test RRDB-SFT (stage2, non-blind SR with ground-truth kernel), run this command:

python test.py --opt options/test/test_stage2.yml

To test MANet+RRDB (stage3, blind SR), run this command:

python test.py --opt options/test/test_stage3.yml

Note: above commands generate LR images on-the-fly. To generate testing sets used in the paper, run this command:

python prepare_testset.py --opt options/test/prepare_testset.yml

Interactive Exploration of Kernels

To explore spaitally variant kernels on an image, use --save_kernel and run this command to save kernel:

python test.py --opt options/test/test_stage1.yml --save_kernel

Then, run this command to creat an interactive window:

python interactive_explore.py --path ../results/001_MANet_aniso_x4_test_stage1/toy_dataset1/npz/toy1.npz

Results

We conducted experiments on both spatially variant and invariant blind SR. Please refer to the paper and supp for results.

Citation

@inproceedings{liang21manet,
  title={Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution},
  author={Liang, Jingyun and Sun, Guolei and Zhang, Kai and Van Gool, Luc and Timofte, Radu},
  booktitle={IEEE Conference on International Conference on Computer Vision},
  year={2021}
}

License & Acknowledgement

This project is released under the Apache 2.0 license. The codes are based on BasicSR, MMSR, IKC and KAIR. Please also follow their licenses. Thanks for their great works.

Comments
  • Training and OOM

    Training and OOM

    Thanks for your code. I tried to train the model with train_stage1.yml, and the Cuda OOM. I am using 2080 Ti, I tried to reduce the batch size from 16 to 2 and the GT_size from 192 to 48. However, the training still OOM. May I know is there anything I missed? Thanks.

    opened by hcleung3325 9
  • [How to get SR image by spatially variant estimated blur kernels]

    [How to get SR image by spatially variant estimated blur kernels]

    Hi, Thank you for your excellent and interesting work! I'm not so clear about the process after kernels estimation during SR reconstruction after reading your paper. Could you please explain?

    opened by CaptainEven 7
  • The method of creating kernels

    The method of creating kernels

    I noticed that the function for creating kernel ('anisotropic_gaussian_kernel_matlab') is different from the standard gaussian distribution (e.g. the method that used in IKC, https://github.com/yuanjunchai/IKC/blob/2a846cf1194cd9bace08973d55ecd8fd3179fe48/codes/utils/util.py#L244). I am wondering why a different way is used here. Actually, a test dataset created by IKC with same sigma range seems to have poor performance on MANet, and vice versa.

    opened by zhiqiangfu 3
  • [import error]

    [import error]

        k = scipy.stats.multivariate_normal.pdf(pos, mean=[0, 0], cov=cov)
    AttributeError: module 'scipy' has no attribute 'stats'
    

    scipy version error? So, which version of scipy is required?

    opened by CaptainEven 2
  • A letter from afar

    A letter from afar

    Good evening, boss! I recently discovered your work about MANet.I found that the length of the gaussian kernel your method generated is equal to 18.Does this setting have any specific meaning? image

    opened by fenghao195 0
  • New Super-Resolution Benchmarks

    New Super-Resolution Benchmarks

    Hello,

    MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.

    If you are interested in participating, you can add your algorithm following the submission steps:

    We would be grateful for your feedback on our work!

    opened by EvgeneyBogatyrev 0
  • About LR_Image PSNR/SSIM

    About LR_Image PSNR/SSIM

    Many thanks for your excellent work!

    I wonder what is the LR_Image PSNR/SSIM in the ablation study to evaluate the MANet about kernel prediction, and how to compute these?

    opened by Shaosifan 0
  • Questions about the paper

    Questions about the paper

    Thanks again for your great work. I have several questions about the paper. In Figure 2, you mentioned the input for MANet is a LR, but the input for your code seems to be DIV2K GT. Is there any further process I miss? Also, is that possible for the whole model trained in y-channel since my deployed environment only deals with y-channel? Thanks.

    opened by mrgreen3325 0
  • Issue about class BatchBlur_SV in utils.util

    Issue about class BatchBlur_SV in utils.util

    MANet/codes/utils/util.py Line 661: kernel = kernel.flatten(2).unsqueeze(0).expand(3,-1,-1,-1) The kernel shape: [B, HW, l, l] ->[B, HW, l^2] ->[1, B, HW, l^2] ->[C, B, HW, l^2] I think it is wrong, because it is not corresponding to the shape of pad.

    The line 661 should be kernel = kernel.flatten(2).unsqueeze(1).expand(-1, 3,-1,-1) The kernel shape: [B, HW, l, l] ->[B, HW, l^2] ->[B, 1, HW, l^2] ->[B, C, HW, l^2]

    opened by jiangmengyu18 0
Owner
Jingyun Liang
PhD Student at Computer Vision Lab, ETH Zurich
Jingyun Liang
Large scale embeddings on a single machine.

Marius Marius is a system under active development for training embeddings for large-scale graphs on a single machine. Training on large scale graphs

Marius 107 Jan 03, 2023
Group R-CNN for Point-based Weakly Semi-supervised Object Detection (CVPR2022)

Group R-CNN for Point-based Weakly Semi-supervised Object Detection (CVPR2022) By Shilong Zhang*, Zhuoran Yu*, Liyang Liu*, Xinjiang Wang, Aojun Zhou,

Shilong Zhang 129 Dec 24, 2022
TeachMyAgent is a testbed platform for Automatic Curriculum Learning methods in Deep RL.

TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL Paper Website Documentation TeachMyAgent is a testbed platform for Automatic Cu

Flowers Team 51 Dec 25, 2022
(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain

Kaleido-BERT: Vision-Language Pre-training on Fashion Domain Mingchen Zhuge*, Dehong Gao*, Deng-Ping Fan#, Linbo Jin, Ben Chen, Haoming Zhou, Minghui

248 Dec 04, 2022
Ganilla - Official Pytorch implementation of GANILLA

GANILLA We provide PyTorch implementation for: GANILLA: Generative Adversarial Networks for Image to Illustration Translation. Paper Arxiv Updates (Fe

Samet Hi 462 Dec 05, 2022
Axel - 3D printed robotic hands and they controll with Raspberry Pi and Arduino combo

Axel It's our graduation project about 3D printed robotic hands and they control

0 Feb 14, 2022
Implementation of Feedback Transformer in Pytorch

Feedback Transformer - Pytorch Simple implementation of Feedback Transformer in Pytorch. They improve on Transformer-XL by having each token have acce

Phil Wang 93 Oct 04, 2022
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)

English | 简体中文 Welcome to the PaddlePaddle GitHub. PaddlePaddle, as the only independent R&D deep learning platform in China, has been officially open

19.4k Jan 04, 2023
The best solution of the Weather Prediction track in the Yandex Shifts challenge

yandex-shifts-weather The repository contains information about my solution for the Weather Prediction track in the Yandex Shifts challenge https://re

Ivan Yu. Bondarenko 15 Dec 18, 2022
Athena is the only tool that you will ever need to optimize your portfolio.

Athena Portfolio optimization is the process of selecting the best portfolio (asset distribution), out of the set of all portfolios being considered,

Indrajit 1 Mar 25, 2022
NaturalProofs: Mathematical Theorem Proving in Natural Language

NaturalProofs: Mathematical Theorem Proving in Natural Language NaturalProofs: Mathematical Theorem Proving in Natural Language Sean Welleck, Jiacheng

Sean Welleck 83 Jan 05, 2023
This repo will contain code to reproduce and build upon understanding transfer learning

What is being transferred in transfer learning? This repo contains the code for the following paper: Behnam Neyshabur*, Hanie Sedghi*, Chiyuan Zhang*.

4 Jun 16, 2021
This repository provides a basic implementation of our GCPR 2021 paper "Learning Conditional Invariance through Cycle Consistency"

Learning Conditional Invariance through Cycle Consistency This repository provides a basic TensorFlow 1 implementation of the proposed model in our GC

BMDA - University of Basel 1 Nov 04, 2022
Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in Tensorflow Lite.

TFLite-msg_chn_wacv20-depth-completion Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model

Ibai Gorordo 2 Oct 04, 2021
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

LightHuBERT LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Github | Huggingface | SUPER

WangRui 46 Dec 29, 2022
A diff tool for language models

LMdiff Qualitative comparison of large language models. Demo & Paper: http://lmdiff.net LMdiff is a MIT-IBM Watson AI Lab collaboration between: Hendr

Hendrik Strobelt 27 Dec 29, 2022
Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te

Minh-Khoi Pham 5 Nov 05, 2022
A embed able annotation tool for end to end cross document co-reference

CoRefi CoRefi is an emebedable web component and stand alone suite for exaughstive Within Document and Cross Document Coreference Anntoation. For a de

PythicCoder 39 Dec 12, 2022
Official implementation of "Refiner: Refining Self-attention for Vision Transformers".

RefinerViT This repo is the official implementation of "Refiner: Refining Self-attention for Vision Transformers". The repo is build on top of timm an

101 Dec 29, 2022
Official repository of IMPROVING DEEP IMAGE MATTING VIA LOCAL SMOOTHNESS ASSUMPTION.

IMPROVING DEEP IMAGE MATTING VIA LOCAL SMOOTHNESS ASSUMPTION This is the official repository of IMPROVING DEEP IMAGE MATTING VIA LOCAL SMOOTHNESS ASSU

电线杆 14 Dec 15, 2022