Reference code for the paper "Cross-Camera Convolutional Color Constancy" (ICCV 2021)

Overview

Cross-Camera Convolutional Color Constancy, ICCV 2021 (Oral)

Mahmoud Afifi¹﹐², Jonathan T. Barron², Chloe LeGendre², Yun-Ta Tsai², and Francois Bleibel²

¹York University   ²Google Research

Paper | Poster | PPT | Video

[Figure: C5_teaser]

Reference code for the paper Cross-Camera Convolutional Color Constancy. Mahmoud Afifi, Jonathan T. Barron, Chloe LeGendre, Yun-Ta Tsai, and Francois Bleibel. In ICCV, 2021. If you use this code, please cite our paper:

@InProceedings{C5,
  title={Cross-Camera Convolutional Color Constancy},
  author={Afifi, Mahmoud and Barron, Jonathan T and LeGendre, Chloe and Tsai, Yun-Ta and Bleibel, Francois},
  booktitle={The IEEE International Conference on Computer Vision (ICCV)},
  year={2021}
}

[Figure: C5_figure]

Code

Prerequisite

  • PyTorch
  • opencv-python
  • tqdm
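
These can typically be installed with pip (package names as on PyPI; any recent PyTorch version should work):

pip install torch opencv-python tqdm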

Training

To train C5, training/validation data should have the following format:

- train_folder/
       | image1_sensorname_camera1.png
       | image1_sensorname_camera1_metadata.json
       | image2_sensorname_camera1.png
       | image2_sensorname_camera1_metadata.json
       ...
       | image1_sensorname_camera2.png
       | image1_sensorname_camera2_metadata.json
       ...

In src/ops.py, the function add_camera_name(dataset_dir) can be used to rename image filenames and corresponding ground-truth JSON files. Each JSON file should include a key named either illuminant_color_raw or gt_ill that has the ground-truth illuminant color of the corresponding image.
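
For reference, the snippet below sketches the expected setup: the documented renaming helper, plus a minimal metadata file written next to an image. Only add_camera_name(dataset_dir) and the two JSON key names come from the code; the file paths and illuminant values are placeholders.

import json

from src.ops import add_camera_name

# Rename image files and their ground-truth JSON files to follow the
# *_sensorname_<camera>.png convention (documented helper).
add_camera_name('train_folder/')

# Each image needs a companion *_metadata.json holding the ground-truth
# illuminant under either 'illuminant_color_raw' or 'gt_ill'.
# The RGB values below are placeholders, not real ground truth.
with open('train_folder/image1_sensorname_camera1_metadata.json', 'w') as f:
    json.dump({'gt_ill': [0.45, 0.78, 0.42]}, f)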

The training code is given in train.py. The following parameters are required to set the model configuration and specify the training data.

  • --data-num: the number of images used for each inference (additional images + the input query image); referred to as m in the main paper.
  • --input-size: number of histogram bins.
  • --learn-G: to use a G multiplier as explained in the paper.
  • --training-dir-in: training image directory.
  • --validation-dir-in: validation image directory; when this variable is None (default), the validation set will be taken from the training data based on the --validation-ratio.
  • --validation-ratio: when --validation-dir-in is None, this argument determines the validation set ratio of the image set in --training-dir-in directory.
  • --augmentation-dir: directory (or directories) of augmentation data (optional).
  • --model-name: name of the trained model.

The following parameters control the training settings and hyperparameters (an example training command is shown after the list):

  • --epochs: number of epochs.
  • --batch-size: batch size.
  • --load-hist: to load histograms if pre-computed (recommended).
  • --optimizer: optimization algorithm for stochastic gradient descent; options are Adam or SGD.
  • --learning-rate: learning rate.
  • --l2reg: L2 regularization factor.
  • --load: to load a C5 model from a .pth file; default is False.
  • --model-location: when --load is True, this variable should point to the full path of the .pth model file.
  • --validation-frequency: validation frequency (in epochs).
  • --cross-validation: to use three-fold cross-validation. When this variable is True, --validation-dir-in and --validation-ratio will be ignored, and 3-fold cross-validation will be applied to the data provided in --training-dir-in.
  • --gpu: GPU device ID.
  • --smoothness-factor-*: smoothness loss factor of the following model components: F (conv filter), B (bias), G (multiplier layer). For example, --smoothness-factor-F can be used to set the smoothness loss for the conv filter.
  • --increasing-batch-size: to increase the batch size during training.
  • --grad-clip-value: gradient clipping value; if it's set to 0 (default), no clipping is applied.
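
Putting these together, a training run might be launched as follows; this is a sketch, and the flag values and model name are illustrative rather than prescribed defaults:

python train.py --training-dir-in /path/to/train_folder --data-num 7 --input-size 64 --learn-G True --epochs 60 --batch-size 16 --model-name C5_m_7_h_64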

Testing

To test a pre-trained C5 model, testing data should have the following format:

- test_folder/
       | image1_sensorname_camera1.png
       | image1_sensorname_camera1_metadata.json
       | image2_sensorname_camera1.png
       | image2_sensorname_camera1_metadata.json
       ...
       | image1_sensorname_camera2.png
       | image1_sensorname_camera2_metadata.json
       ...

The testing code is given in test.py. The following parameters are required to set the model configuration and specify the testing data.

  • --model-name: name of the trained model.
  • --data-num: the number of images used for each inference (additional images + the input query image); referred to as m in the main paper.
  • --input-size: number of histogram bins.
  • --g-multiplier: to use a G multiplier as explained in the paper.
  • --testing-dir-in: testing image directory.
  • --batch-size: batch size.
  • --load-hist: to load histograms if pre-computed (recommended).
  • --multiple_test: to apply multiple tests (ten as mentioned in the paper) and save their results.
  • --white-balance: to save white-balanced testing images.
  • --cross-validation: to use three-fold cross-validation. When it is set to True, three pre-trained models are expected, each saved with the fold number as a postfix. The testing image filenames should be listed in .npy files located in the folds directory and named after the dataset, which should match the folder name given in --testing-dir-in.
  • --gpu: GPU device ID.

In the images directory, there are a few examples captured by a Mobile Sony IMX135 from the INTEL-TAU dataset. To white balance these raw images, as shown in the figure below, with a C5 model (trained on DSLR cameras from the NUS and Gehler-Shi datasets), use the following command:

python test.py --testing-dir-in ./images --white-balance True --model-name C5_m_7_h_64

[Figure: c5_examples]

To test with the gain multiplier, use the following command:

python test.py --testing-dir-in ./images --white-balance True --g-multiplier True --model-name C5_m_7_h_64_w_G

Note that, at test time, C5 does not require any metadata. The testing code uses the JSON files only to load the ground-truth illuminants for comparison against the estimated values.

Data augmentation

The raw-to-raw augmentation functions are provided in src/aug_ops.py. Call the set_sampling_params function to set the sampling parameters (e.g., excluding certain cameras/datasets from the source set, determining the number of augmented images, etc.). Then, call the map_raw_images function to generate a new augmentation set with those parameters (see the sketch after the list below). The function map_raw_images takes four arguments:

  • xyz_img_dir: directory of XYZ images; you can download the CIE XYZ images from here. All images were transformed to the CIE XYZ space after applying the black-level normalization and masking out the calibration object (i.e., the color rendition chart or SpyderCUBE).
  • target_cameras: a list of one or more of the following camera models: Canon EOS 550D, Canon EOS 5D, Canon EOS-1DS, Canon EOS-1Ds Mark III, Fujifilm X-M1, Nikon D40, Nikon D5200, Olympus E-PL6, Panasonic DMC-GX1, Samsung NX2000, Sony SLT-A57, or All.
  • output_dir: output directory to save the augmented images and their metadata files.
  • params: sampling parameters set by the set_sampling_params function.
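
A minimal sketch of this workflow is shown below; only the two function names and the four map_raw_images arguments are documented above, so the default sampling parameters and the example paths/cameras are assumptions:

from src.aug_ops import set_sampling_params, map_raw_images

# Set sampling parameters (defaults assumed; set_sampling_params exposes
# options such as excluding cameras/datasets from the source set and
# choosing the number of augmented images).
params = set_sampling_params()

# Map the CIE XYZ images to the target camera sensors and write the
# augmented images and their metadata JSON files to output_dir.
map_raw_images(xyz_img_dir='/path/to/xyz_images',
               target_cameras=['Canon EOS 5D', 'Sony SLT-A57'],
               output_dir='/path/to/augmented_set',
               params=params)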