[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

Overview

Focal Frequency Loss - Official PyTorch Implementation

teaser

This repository provides the official PyTorch implementation for the following paper:

Focal Frequency Loss for Image Reconstruction and Synthesis
Liming Jiang, Bo Dai, Wayne Wu and Chen Change Loy
In ICCV 2021.
Project Page | Paper | Poster | Slides | YouTube Demo

Abstract: Image reconstruction and synthesis have witnessed remarkable progress thanks to the development of generative models. Nonetheless, gaps could still exist between the real and generated images, especially in the frequency domain. In this study, we show that narrowing gaps in the frequency domain can ameliorate image reconstruction and synthesis quality further. We propose a novel focal frequency loss, which allows a model to adaptively focus on frequency components that are hard to synthesize by down-weighting the easy ones. This objective function is complementary to existing spatial losses, offering great impedance against the loss of important frequency information due to the inherent bias of neural networks. We demonstrate the versatility and effectiveness of focal frequency loss to improve popular models, such as VAE, pix2pix, and SPADE, in both perceptual quality and quantitative performance. We further show its potential on StyleGAN2.

Updates

  • [09/2021] The code of Focal Frequency Loss is released.

  • [07/2021] The paper of Focal Frequency Loss is accepted by ICCV 2021.

Quick Start

Run pip install focal-frequency-loss for installation. Then, the following code is all you need.

from focal_frequency_loss import FocalFrequencyLoss as FFL
ffl = FFL(loss_weight=1.0, alpha=1.0)  # initialize nn.Module class

import torch
fake = torch.randn(4, 3, 64, 64)  # replace it with the predicted tensor of shape (N, C, H, W)
real = torch.randn(4, 3, 64, 64)  # replace it with the target tensor of shape (N, C, H, W)

loss = ffl(fake, real)  # calculate focal frequency loss

Tips:

  1. Current supported PyTorch version: torch>=1.1.0. Warnings can be ignored. Please note that experiments in the paper were conducted with torch<=1.7.1,>=1.1.0.
  2. Arguments to initialize the FocalFrequencyLoss class:
    • loss_weight (float): weight for focal frequency loss. Default: 1.0
    • alpha (float): the scaling factor alpha of the spectrum weight matrix for flexibility. Default: 1.0
    • patch_factor (int): the factor to crop image patches for patch-based focal frequency loss. Default: 1
    • ave_spectrum (bool): whether to use minibatch average spectrum. Default: False
    • log_matrix (bool): whether to adjust the spectrum weight matrix by logarithm. Default: False
    • batch_matrix (bool): whether to calculate the spectrum weight matrix using batch-based statistics. Default: False
  3. Experience shows that the main hyperparameters you need to adjust are loss_weight and alpha. The loss weight may always need to be adjusted first. Then, a larger alpha indicates that the model is more focused. We use alpha=1.0 as default.

Exmaple: Image Reconstruction (Vanilla AE)

As a guide, we provide an example of applying the proposed focal frequency loss (FFL) for Vanilla AE image reconstruction on CelebA. Applying FFL is pretty easy. The core details can be found here.

Installation

After installing Anaconda, we recommend you to create a new conda environment with python 3.8.3:

conda create -n ffl python=3.8.3 -y
conda activate ffl

Clone this repo, install PyTorch 1.4.0 (torch>=1.1.0 may also work) and other dependencies:

git clone https://github.com/EndlessSora/focal-frequency-loss.git
cd focal-frequency-loss
pip install -r VanillaAE/requirements.txt

Dataset Preparation

In this example, please download img_align_celeba.zip of the CelebA dataset from its official website. Then, we highly recommend you to unzip this file and symlink the img_align_celeba folder to ./datasets/celeba by:

bash scripts/datasets/prepare_celeba.sh [PATH_TO_IMG_ALIGN_CELEBA]

Or you can simply move the img_align_celeba folder to ./datasets/celeba. The resulting directory structure should be:

├── datasets
│    ├── celeba
│    │    ├── img_align_celeba  
│    │    │    ├── 000001.jpg
│    │    │    ├── 000002.jpg
│    │    │    ├── 000003.jpg
│    │    │    ├── ...

Test and Evaluation Metrics

Download the pretrained models and unzip them to ./VanillaAE/experiments.

We have provided the example test scripts. If you only have a CPU environment, please specify --no_cuda in the script. Run:

bash scripts/VanillaAE/test/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/test/celeba_recon_w_ffl.sh

The Vanilla AE image reconstruction results will be saved at ./VanillaAE/results by default.

After testing, you can further calculate the evaluation metrics for this example. We have implemented a series of evaluation metrics we used and provided the metric scripts. Run:

bash scripts/VanillaAE/metrics/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/metrics/celeba_recon_w_ffl.sh

You will see the scores of different metrics. The metric logs will be saved in the respective experiment folders at ./VanillaAE/results.

Training

We have provided the example training scripts. If you only have a CPU environment, please specify --no_cuda in the script. Run:

bash scripts/VanillaAE/train/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/train/celeba_recon_w_ffl.sh 

After training, inference on the newly trained models is similar to Test and Evaluation Metrics. The results could be better reproduced on NVIDIA Tesla V100 GPUs with torch<=1.7.1,>=1.1.0.

More Results

Here, we show other examples of applying the proposed focal frequency loss (FFL) under diverse settings.

Image Reconstruction (VAE)

reconvae

Image-to-Image Translation (pix2pix | SPADE)

consynI2I

Unconditional Image Synthesis (StyleGAN2)

256x256 results (without truncation) and the mini-batch average spectra (adjusted to better contrast):

unsynsg2res256

1024x1024 results (without truncation) synthesized by StyleGAN2 with FFL:

unsynsg2res1024

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{jiang2021focal,
  title={Focal Frequency Loss for Image Reconstruction and Synthesis},
  author={Jiang, Liming and Dai, Bo and Wu, Wayne and Loy, Chen Change},
  booktitle={ICCV},
  year={2021}
}

Acknowledgments

The code of Vanilla AE is inspired by PyTorch DCGAN and MUNIT. Part of the evaluation metric code is borrowed from MMEditing. We also apply LPIPS and pytorch-fid as evaluation metrics.

License

All rights reserved. The code is released under the MIT License.

Copyright (c) 2021

Owner
Liming Jiang
Ph.D. student, [email protected]
Liming Jiang
Tiny-NewsRec: Efficient and Effective PLM-based News Recommendation

Tiny-NewsRec The source codes for our paper "Tiny-NewsRec: Efficient and Effective PLM-based News Recommendation". Requirements PyTorch == 1.6.0 Tensor

Yang Yu 3 Dec 07, 2022
PyTorch implementation of a Real-ESRGAN model trained on custom dataset

Real-ESRGAN PyTorch implementation of a Real-ESRGAN model trained on custom dataset. This model shows better results on faces compared to the original

Sber AI 160 Jan 04, 2023
FedJAX is a library for developing custom Federated Learning (FL) algorithms in JAX.

FedJAX: Federated learning with JAX What is FedJAX? FedJAX is a library for developing custom Federated Learning (FL) algorithms in JAX. FedJAX priori

Google 208 Dec 14, 2022
Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

1.9k Dec 28, 2022
Attention over nodes in Graph Neural Networks using PyTorch (NeurIPS 2019)

Intro This repository contains code to generate data and reproduce experiments from our NeurIPS 2019 paper: Boris Knyazev, Graham W. Taylor, Mohamed R

Boris Knyazev 242 Jan 06, 2023
FSL-Mate: A collection of resources for few-shot learning (FSL).

FSL-Mate is a collection of resources for few-shot learning (FSL). In particular, FSL-Mate currently contains FewShotPapers: a paper list which tracks

Yaqing Wang 1.5k Jan 08, 2023
[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation

Full-Duplex Strategy for Video Object Segmentation (ICCV, 2021) Authors: Ge-Peng Ji, Keren Fu, Zhe Wu, Deng-Ping Fan*, Jianbing Shen, & Ling Shao This

Daniel-Ji 55 Dec 22, 2022
Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology

Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology Sharon Zhou, Eric Zelikman

Stanford Machine Learning Group 34 Nov 16, 2022
用强化学习DQN算法,训练AI模型来玩合成大西瓜游戏,提供Keras版本和PARL(paddle)版本

用强化学习玩合成大西瓜 代码地址:https://github.com/Sharpiless/play-daxigua-using-Reinforcement-Learning 用强化学习DQN算法,训练AI模型来玩合成大西瓜游戏,提供Keras版本、PARL(paddle)版本和pytorch版本

72 Dec 17, 2022
An adaptive hierarchical energy management strategy for hybrid electric vehicles

An adaptive hierarchical energy management strategy This project contains the source code of an adaptive hierarchical EMS combining heuristic equivale

19 Dec 13, 2022
Official implementation of the MM'21 paper Constrained Graphic Layout Generation via Latent Optimization

[MM'21] Constrained Graphic Layout Generation via Latent Optimization This repository provides the official code for the paper "Constrained Graphic La

Kotaro Kikuchi 73 Dec 27, 2022
Multi-Output Gaussian Process Toolkit

Multi-Output Gaussian Process Toolkit Paper - API Documentation - Tutorials & Examples The Multi-Output Gaussian Process Toolkit is a Python toolkit f

GAMES 113 Nov 25, 2022
Stereo Hybrid Event-Frame (SHEF) Cameras for 3D Perception, IROS 2021

For academic use only. Stereo Hybrid Event-Frame (SHEF) Cameras for 3D Perception Ziwei Wang, Liyuan Pan, Yonhon Ng, Zheyu Zhuang and Robert Mahony Th

Ziwei Wang 11 Jan 04, 2023
Multi-angle c(q)uestion answering

Macaw Introduction Macaw (Multi-angle c(q)uestion answering) is a ready-to-use model capable of general question answering, showing robustness outside

AI2 430 Jan 04, 2023
5 Jan 05, 2023
OpenGAN: Open-Set Recognition via Open Data Generation

OpenGAN: Open-Set Recognition via Open Data Generation ICCV 2021 (oral) Real-world machine learning systems need to analyze novel testing data that di

Shu Kong 90 Jan 06, 2023
A learning-based data collection tool for human segmentation

FullBodyFilter A Learning-Based Data Collection Tool For Human Segmentation Contents Documentation Source Code and Scripts Overview of Project Usage O

Robert Jiang 4 Jun 24, 2022
HMLLDB is a collection of LLDB commands to assist in the debugging of iOS apps.

HMLLDB is a collection of LLDB commands to assist in the debugging of iOS apps. 中文介绍 Features Non-intrusive. Your iOS project does not need to be modi

mao2020 47 Oct 22, 2022
You Only Look One-level Feature (YOLOF), CVPR2021, Detectron2

You Only Look One-level Feature (YOLOF), CVPR2021 A simple, fast, and efficient object detector without FPN. This repo provides a neat implementation

qiang chen 273 Jan 03, 2023
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Aviv Gabbay 41 Nov 29, 2022