Code of paper "CDFI: Compression-Driven Network Design for Frame Interpolation", CVPR 2021

Related tags

Deep LearningCDFI
Overview

CDFI (Compression-Driven-Frame-Interpolation)

[Paper] (Coming soon...) | [arXiv]

Tianyu Ding*, Luming Liang*, Zhihui Zhu, Ilya Zharkov

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Introduction

We propose a Compression-Driven network design for Frame Interpolation (CDFI), that leverages model compression to significantly reduce the model size (allows a better understanding of the current architecture) while making room for further improvements and achieving superior performance in the end. Concretely, we first compress AdaCoF and show that a 10X compressed AdaCoF performs similarly as its original counterpart; then we improve upon this compressed model with simple modifications. Note that typically it is prohibitive to implement the same improvements on the original heavy model.

  • We achieve a significant performance gain with only a quarter in size compared with the original AdaCoF

    Vimeo-90K Middlebury UCF101-DVF #Params
    PSNR, SSIM, LPIPS PSNR, SSIM, LPIPS PSNR, SSIM, LPIPS
    AdaCoF 34.38, 0.974, 0.019 35.74, 0.979, 0.019 35.20, 0.967, 0.019 21.8M
    Compressed AdaCoF 34.15, 0.973, 0.020 35.46, 0.978, 0.019 35.14, 0.967, 0.019 2.45M
    AdaCoF+ 34.58, 0.975, 0.018 36.12, 0.981, 0.017 35.19, 0.967, 0.019 22.9M
    Compressed AdaCoF+ 34.46, 0.975, 0.019 35.76, 0.979, 0.019 35.16, 0.967, 0.019 2.56M
    Our Final Model 35.19, 0.978, 0.010 37.17, 0.983, 0.008 35.24, 0.967, 0.015 4.98M
  • Our final model also performs favorably against other state-of-the-arts (details refer to our paper)

  • The proposed framework is generic and can be easily transferred to other DNN-based frame interpolation method

The above GIF is a demo of using our method to generate slow motion video, which increases the FPS from 5 to 160. We also provide a long video demonstration here (redirect to YouTube).

Environment

  • CUDA 11.0

  • python 3.8.3

  • torch 1.6.0

  • torchvision 0.7.0

  • cupy 7.7.0

  • scipy 1.5.2

  • numpy 1.19.1

  • Pillow 7.2.0

  • scikit-image 0.17.2

Test Pre-trained Models

Download repository:

$ git clone https://github.com/tding1/CDFI.git
$ cd CDFI/

Testing data

For user convenience, we already provide the Middlebury and UCF101-DVF test datasets in our repository, which can be found under directory test_data/.

Evaluation metrics

We use the built-in functions in skimage.metrics to compute the PSNR and SSIM, for which the higher the better. We also use LPIPS, a newly proposed metric that measures perceptual similarity, for which the smaller the better. For user convenience, we include the implementation of LPIPS in our repo under lpips_pytorch/, which is a slightly modified version of here (with an updated squeezenet backbone).

Test our pre-trained CDFI model

$ python test.py --gpu_id 0

By default, it will load our pre-trained model checkpoints/CDFI_adacof.pth. It will print the quantitative results on both Middlebury and UCF101-DVF, and the interpolated images will be saved under test_output/cdfi_adacof/.

Test the compressed AdaCoF

$ python test_compressed_adacof.py --gpu_id 0 --kernel_size 5 --dilation 1

By default, it will load the compressed AdaCoF model checkpoints/compressed_adacof_F_5_D_1.pth. It will print the quantitative results on both Middlebury and UCF101-DVF, and the interpolated images will be saved under test_output/compressed_adacof_F_5_D_1/.

Test the compressed AdaCoF+

$ python test_compressed_adacof.py --gpu_id 0 --kernel_size 11 --dilation 2

By default, it will load the compressed AdaCoF+ model checkpoints/compressed_adacof_F_11_D_2.pth. It will print the quantitative results on both Middlebury and UCF101-DVF, and the interpolated images will be saved under test_output/compressed_adacof_F_11_D_2/.

Interpolate two frames

$ python interpolate_twoframe.py --gpu_id 0 --first_frame figs/0.png --second_frame figs/1.png --output_frame output.png

By default, it will load our pre-trained model checkpoints/CDFI_adacof.pth, and generate the intermediate frame output.png given two consecutive frames in a sequence.

Train Our Model

Training data

We use the Vimeo-90K triplet dataset for video frame interpolation task, which is relatively large (>32 GB).

$ wget http://data.csail.mit.edu/tofu/dataset/vimeo_triplet.zip
$ unzip vimeo_triplet.zip
$ rm vimeo_triplet.zip

Start training

$ python train.py --gpu_id 0 --data_dir path/to/vimeo_triplet/ --batch_size 8

It will generate an unique ID for each training, and all the intermediate results/records will be saved under model_weights/<training id>/. For a GPU device with memory around 10GB, the --batch_size can take a value as large as 3, otherwise CUDA may be out of memory. There are many other training options, e.g., --lr, --epochs, --loss and so on, can be found in train.py.

Apply CDFI to New Models

One nice thing about CDFI is that the framework can be easily applied to other (heavy) DNN models and potentially boost their performance. The key to CDFI is the optimization-based compression that compresses a model via fine-grained pruning. In particular, we use the efficient and easy-to-use sparsity-inducing optimizer OBPROXSG (see also paper), and summarize the compression procedure for any other model in the following.

  • Copy the OBPROXSG optimizer, which is already implemented as torch.optim.optimizer, to your working directory
  • Starting from a pre-trained model, finetune its weights by using the OBPROXSG optimizer, like using any standard PyTorch built-in optimizer such as SGD or Adam
    • It is not necessarily to use the full dataset for this finetuning process
  • The parameters for the OBPROXSG optimizer
    • lr: learning rate
    • lambda_: coefficient of the L1 regularization term
    • epochSize: number of batches in a epoch
    • Np: number of proximal steps, which is set to be 2 for pruning AdaCoF
    • No: number of orthant steps (key step to promote sparsity), for which we recommend using the default setting
    • eps: threshold for trimming zeros, which is set to be 0.0001 for pruning AdaCoF
  • After the optimization is done (either by reaching a maximum number of epochs or achieving a high sparsity), use the layer density as the compression ratio for that layer, as described in the paper
  • As an example, compare the architectures in models/adacof.py and model/compressed_adacof.py for compressing AdaCoF with the above procedure

Now it's ready to make further improvements/modifications on the compressed model, based on the understanding of its flaws/drawbacks.

Citation

Coming soon...

Acknowledgements

The code is largely based on HyeongminLEE/AdaCoF-pytorch and baowenbo/DAIN.

Owner
Tianyu Ding
Ph.D. in Applied Mathematics \\ Master in Computer Science
Tianyu Ding
(CVPR 2022 - oral) Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry Official implementation of the paper Multi-View Depth Est

Bae, Gwangbin 138 Dec 28, 2022
Code for our CVPR2021 paper coordinate attention

Coordinate Attention for Efficient Mobile Network Design (preprint) This repository is a PyTorch implementation of our coordinate attention (will appe

Qibin (Andrew) Hou 726 Jan 05, 2023
A Multi-modal Model Chinese Spell Checker Released on ACL2021.

ReaLiSe ReaLiSe is a multi-modal Chinese spell checking model. This the office code for the paper Read, Listen, and See: Leveraging Multimodal Informa

DaDa 106 Dec 29, 2022
This is just a funny project that we want to see AutoEncoder (AE) can actually work to enhance the features we want

Funny_muscle_enhancer :) 1.Discription: This is just a funny project that we want to see AutoEncoder (AE) can actually work on the some features. We w

Jing-Yao Chen (Jacob) 8 Oct 01, 2022
Based on the paper "Geometry-aware Instance-reweighted Adversarial Training" ICLR 2021 oral

Geometry-aware Instance-reweighted Adversarial Training This repository provides codes for Geometry-aware Instance-reweighted Adversarial Training (ht

Jingfeng 47 Dec 22, 2022
Registration Loss Learning for Deep Probabilistic Point Set Registration

RLLReg This repository contains a Pytorch implementation of the point set registration method RLLReg. Details about the method can be found in the 3DV

Felix Järemo Lawin 35 Nov 02, 2022
Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Visual Parser (ViP) This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers. Key Feature

Shuyang Sun 117 Dec 11, 2022
Gif-caption - A straightforward GIF Captioner written in Python

Broksy's GIF Captioner Have you ever wanted to easily caption a GIF without havi

3 Apr 09, 2022
Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks.

Heterogeneous Graph Benchmark Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks. Roadmap We organize our repo by task, and on

THUDM 176 Dec 17, 2022
Ray tracing of a Schwarzschild black hole written entirely in TensorFlow.

TensorGeodesic Ray tracing of a Schwarzschild black hole written entirely in TensorFlow. Dependencies: Python 3 TensorFlow 2.x numpy matplotlib About

5 Jan 15, 2022
A library for performing coverage guided fuzzing of neural networks

TensorFuzz: Coverage Guided Fuzzing for Neural Networks This repository contains a library for performing coverage guided fuzzing of neural networks,

Brain Research 195 Dec 28, 2022
Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

DSE 314/614: Reinforcement Learning This repository containing reinforcement lea

Manav Mishra 4 Apr 15, 2022
Automated Attendance Project Using Face Recognition

dependencies for project: cmake 3.22.1 dlib 19.22.1 face-recognition 1.3.0 openc

Rohail Taha 1 Jan 09, 2022
The official PyTorch implementation for the paper "sMGC: A Complex-Valued Graph Convolutional Network via Magnetic Laplacian for Directed Graphs".

Magnetic Graph Convolutional Networks About The official PyTorch implementation for the paper sMGC: A Complex-Valued Graph Convolutional Network via M

3 Feb 25, 2022
AI Summer's complete catalog of articles

Learn Deep Learning with AI Summer A collection of all articles (almost 100) written for the AI Summer blog organized by topic. Deep Learning Theory M

AI Summer 95 Dec 29, 2022
CS5242_2021 - Neural Networks and Deep Learning, NUS CS5242, 2021

CS5242_2021 Neural Networks and Deep Learning, NUS CS5242, 2021 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : https:/

Xavier Bresson 165 Oct 25, 2022
Wordle-solver - Wordle answer generation program in python

🟨 Wordle Solver 🟩 Wordle answer generation program in python ✔️ Requirements U

Dahyun Kang 4 May 28, 2022
Reverse engineer your pytorch vision models, in style

🔍 Rover Reverse engineer your CNNs, in style Rover will help you break down your CNN and visualize the features from within the model. No need to wri

Mayukh Deb 32 Sep 24, 2022
Collections for the lasted paper about multi-view clustering methods (papers, codes)

Multi-View Clustering Papers Collections for the lasted paper about multi-view clustering methods (papers, codes). There also exists some repositories

Andrew Guan 10 Sep 20, 2022
An end-to-end regression problem of predicting the price of properties in Bangalore.

Bangalore-House-Price-Prediction An end-to-end regression problem of predicting the price of properties in Bangalore. Deployed in Heroku using Flask.

Shruti Balan 1 Nov 25, 2022