Official PyTorch Implementation for "Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes"

Overview

PVDNet: Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes

License CC BY-NC

This repository contains the official PyTorch implementation of the following paper:

Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes
Hyeongseok Son, Junyong Lee, Jonghyeop Lee, Sunghyun Cho, Seungyong Lee, TOG 2021 (presented at SIGGRAPH 2021)

About the Research

Click here

Overall Framework

Our video deblurring framework consists of three modules: a blur-invariant motion estimation network (BIMNet), a pixel volume generator, and a pixel volume-based deblurring network (PVDNet). We first train BIMNet; after it has converged, we combine the two networks with the pixel volume generator. We then fix the parameters of BIMNet and train PVDNet by training the entire network.

Blur-Invariant Motion Estimation Network (BIMNet)

To estimate motion between frames accurately, we adopt LiteFlowNet and train it with a blur-invariant loss so that the trained network can estimate blur-invariant optical flow between frames. We train BIMNet with a blur-invariant loss , which is defined as (refer Eq. 1 in the main paper):

The figure shows a qualitative comparison of different optical flow methods. The results of the other methods contain severely distorted structures due to errors in their optical flow maps. In contrast, the results of BIMNets show much less distortions.

Pixel Volume for Motion Compensation

We propose a novel pixel volume that provides multiple candidates for matching pixels between images. Moreover, a pixel volume provides an additional cue for motion compensation based on the majority.

Our pixel volume approach leads to the performance improvement of video deblurring by utilizing the multiple candidates in a pixel volume in two aspects: 1) in most cases, the majority cue for the correct match would help as the statistics (Sec. 4.4 in the main paper) shows, and 2) in other cases, PVDNet would exploit multiple candidates to estimate the correct match referring to nearby pixels with majority cues.

Getting Started

Prerequisites

Tested environment

Ubuntu18.04 Python 3.8.8 PyTorch 1.8.0 CUDA 10.2

  1. Environment setup

    $ git clone https://github.com/codeslake/PVDNet.git
    $ cd PVDNet
    
    $ conda create -y --name PVDNet python=3.8 && conda activate PVDNet
    # for CUDA10.2
    $ sh install_CUDA10.2.sh
    # for CUDA11.1
    $ sh install_CUDA11.1.sh
  2. Datasets

    • Download and unzip Su et al.'s dataset and Nah et al.'s dataset under [DATASET_ROOT]:

      ├── [DATASET_ROOT]
      │   ├── train_DVD
      │   ├── test_DVD
      │   ├── train_nah
      │   ├── test_nah
      

      Note:

      • [DATASET_ROOT] is currently set to ./datasets/video_deblur. It can be specified by modifying config.data_offset in ./configs/config.py.
  3. Pre-trained models

    • Download and unzip pretrained weights under ./ckpt/:

      ├── ./ckpt
      │   ├── BIMNet.pytorch
      │   ├── PVDNet_DVD.pytorch
      │   ├── PVDNet_nah.pytorch
      │   ├── PVDNet_large_nah.pytorch
      

Testing models of TOG2021

For PSNRs and SSIMs reported in the paper, we use the approach of Koehler et al. following Su et al., that first aligns two images using global translation to represent the ambiguity in the pixel location caused by blur.
Refer here for the evaluation code.

## Table 4 in the main paper (Evaluation on Su etal's dataset)
# Our final model 
CUDA_VISIBLE_DEVICES=0 python run.py --mode PVDNet_DVD --config config_PVDNet --data DVD --ckpt_abs_name ckpt/PVDNet_DVD.pytorch

## Table 5 in the main paper (Evaluation on Nah etal's dataset)
# Our final model 
CUDA_VISIBLE_DEVICES=0 python run.py --mode PVDNet_nah --config config_PVDNet --data nah --ckpt_abs_name ckpt/PVDNet_nah.pytorch

# Larger model
CUDA_VISIBLE_DEVICES=0 python run.py --mode PVDNet_large_nah --config config_PVDNet_large --data nah --ckpt_abs_name ckpt/PVDNet_large_nah.pytorch

Note:

  • Testing results will be saved in [LOG_ROOT]/PVDNet_TOG2021/[mode]/result/quanti_quali/[mode]_[epoch]/[data]/.
  • [LOG_ROOT] is set to ./logs/ by default. Refer here for more details about the logging.
  • options
    • --data: The name of a dataset to evaluate: DVD | nah | random. Default: DVD
      • The data structure can be modified in the function set_eval_path(..) in ./configs/config.py.
      • random is for testing models with any video frames, which should be placed as [DATASET_ROOT]/random/[video_name]/*.[jpg|png].

Wiki

Citation

If you find this code useful, please consider citing:

@artical{Son_2021_TOG,
    author = {Son, Hyeongseok and Lee, Junyong and Lee, Jonghyeop and Cho, Sunghyun and Lee, Seungyong},
    title = {Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes},
    journal = {ACM Transactions on Graphics},
    year = {2021}
}

Contact

Open an issue for any inquiries. You may also have contact with [email protected] or [email protected]

Resources

All material related to our paper is available by following links:

Link
The main paper
arXiv
Supplementary Files
Checkpoint Files
Su et al [2017]'s dataset (reference)
Nah et al. [2017]'s dataset (reference)

License

This software is being made available under the terms in the LICENSE file.

Any exemptions to these terms require a license from the Pohang University of Science and Technology.

About Coupe Project

Project ‘COUPE’ aims to develop software that evaluates and improves the quality of images and videos based on big visual data. To achieve the goal, we extract sharpness, color, composition features from images and develop technologies for restoring and improving by using them. In addition, personalization technology through user reference analysis is under study.

Please check out other Coupe repositories in our Posgraph github organization.

Useful Links

Owner
Junyong Lee
Ph.D candidate at POSTECH
Junyong Lee
RoMA: Robust Model Adaptation for Offline Model-based Optimization

RoMA: Robust Model Adaptation for Offline Model-based Optimization Implementation of RoMA: Robust Model Adaptation for Offline Model-based Optimizatio

9 Oct 31, 2022
A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis

WaveGlow A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis Quick Start: Install requirements: pip install

Yuchao Zhang 204 Jul 14, 2022
HGCAE Pytorch implementation. CVPR2021 accepted.

Hyperbolic Graph Convolutional Auto-Encoders Accepted to CVPR2021 🎉 Official PyTorch code of Unsupervised Hyperbolic Representation Learning via Mess

Junho Cho 37 Nov 13, 2022
This repository contains the source code for the paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks",

DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks Project Page | Video | Presentation | Paper | Data L

Facebook Research 281 Dec 22, 2022
EdMIPS: Rethinking Differentiable Search for Mixed-Precision Neural Networks

EdMIPS is an efficient algorithm to search the optimal mixed-precision neural network directly without proxy task on ImageNet given computation budgets. It can be applied to many popular network arch

Zhaowei Cai 47 Dec 30, 2022
Efficient Deep Learning Systems course

Efficient Deep Learning Systems This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Sc

Max Ryabinin 173 Dec 29, 2022
Second-Order Neural ODE Optimizer, NeurIPS 2021 spotlight

Second-order Neural ODE Optimizer (NeurIPS 2021 Spotlight) [arXiv] ✔️ faster convergence in wall-clock time | ✔️ O(1) memory cost | ✔️ better test-tim

Guan-Horng Liu 39 Oct 22, 2022
This GitHub repo consists of Code and Some results of project- Diabetes Treatment using Gold nanoparticles. These Consist of ML Models used for prediction Diabetes and further the basic theory and working of Gold nanoparticles.

GoldNanoparticles This GitHub repo consists of Code and Some results of project- Diabetes Treatment using Gold nanoparticles. These Consist of ML Mode

1 Jan 30, 2022
Face recognize system

FRS Face_recognize_system This project contains my work that target on solving some problems of FRS: Face detection: Retinaface Face anti-spoofing: Fo

Tran Anh Tuan 4 Nov 18, 2021
Explainer for black box models that predict molecule properties

Explaining why that molecule exmol is a package to explain black-box predictions of molecules. The package uses model agnostic explanations to help us

White Laboratory 172 Dec 19, 2022
object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII

赛题背景 在商品知识产权领域,知识产权体现为在线商品的设计和品牌。不幸的是,在每一天,存在着非法商户通过一些对抗手段干扰商标识别来逃避侵权,这带来了很高的知识产权风险和财务损失。为了促进先进的多媒体人工智能技术的发展,以保护企业来之不易的创作和想法免受恶意使用和剽窃,因此提出了鲁棒性标识检测挑战赛

65 Dec 22, 2022
Full Stack Deep Learning Labs

Full Stack Deep Learning Labs Welcome! Project developed during lab sessions of the Full Stack Deep Learning Bootcamp. We will build a handwriting rec

Full Stack Deep Learning 1.2k Dec 31, 2022
Multi-Scale Aligned Distillation for Low-Resolution Detection (CVPR2021)

MSAD Multi-Scale Aligned Distillation for Low-Resolution Detection Lu Qi*, Jason Kuen*, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya J

Jia Research Lab 115 Dec 23, 2022
Doing the asl sign language classification on static images using graph neural networks.

SignLangGNN When GNNs 💜 MediaPipe. This is a starter project where I tried to implement some traditional image classification problem i.e. the ASL si

10 Nov 09, 2022
Pytorch library for fast transformer implementations

Transformers are very successful models that achieve state of the art performance in many natural language tasks

Idiap Research Institute 1.3k Dec 30, 2022
[AAAI 2021] EMLight: Lighting Estimation via Spherical Distribution Approximation and [ICCV 2021] Sparse Needlets for Lighting Estimation with Spherical Transport Loss

EMLight: Lighting Estimation via Spherical Distribution Approximation (AAAI 2021) Update 12/2021: We release our Virtual Object Relighting (VOR) Datas

Fangneng Zhan 144 Jan 06, 2023
A decent AI that solves daily Wordle puzzles. Works with different websites with similar wordlists,.

Wordle-AI A decent AI that solves daily "Wordle" puzzles. Works with different websites with similar wordlists. When prompted with "Word:" enter the w

Ethan 1 Feb 10, 2022
Object Tracking and Detection Using OpenCV

Object tracking is one such application of computer vision where an object is detected in a video, otherwise interpreted as a set of frames, and the object’s trajectory is estimated. For instance, yo

Happy N. Monday 4 Aug 21, 2022
yufan 81 Dec 08, 2022
Research on Tabular Deep Learning (Python package & papers)

Research on Tabular Deep Learning For paper implementations, see the section "Papers and projects". rtdl is a PyTorch-based package providing a user-f

Yura Gorishniy 510 Dec 30, 2022