FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes

Related tags

Deep Learningflexconv
Overview

FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes

This repository contains the source code accompanying the paper:

FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes [Slides] [Poster]
David W. Romero*, Robert-Jan Bruintjes*, Jakub M. Tomczak, Erik J. Bekkers, Mark Hoogendoorn & Jan C. van Gemert.

PWC PWC PWC PWC

Abstract

When designing Convolutional Neural Networks (CNNs), one must select the size of the convolutional kernels before training. Recent works show CNNs benefit from different kernel sizes at different layers, but exploring all possible combinations is unfeasible in practice. A more efficient approach is to learn the kernel size during training. However, existing works that learn the kernel size have a limited bandwidth. These approaches scale kernels by dilation, and thus the detail they can describe is limited. In this work, we propose FlexConv, a novel convolutional operation with which high bandwidth convolutional kernels of learnable kernel size can be learned at a fixed parameter cost. FlexNets model long-term dependencies without the use of pooling, achieve state-of-the-art performance on several sequential datasets, outperform recent works with learned kernel sizes, and are competitive with much deeper ResNets on image benchmark datasets. Additionally, FlexNets can be deployed at higher resolutions than those seen during training. To avoid aliasing, we propose a novel kernel parameterization with which the frequency of the kernels can be analytically controlled. Our novel kernel parameterization shows higher descriptive power and faster convergence speed than existing parameterizations. This leads to important improvements in classification accuracy.

drawing

Repository structure

This repository is organized as follows:

  • ckconv contains the main PyTorch library of our model.

  • models and datasets contain the models and datasets used throughout our experiments;

  • cfg contains the default configuration of our run_*.py scripts, in YAML. We use Hydra with OmegaConf to manage the configuration of our experiments.

  • experiments contains commands to replicate the experiments from the paper.

  • ckernel_fitting contains source code to run experiments to approximate convolutional filters via MLPs. Please see ckernel_fitting/README.md for further details.

Using the code

Image classification experiments are run with run_experiment.py. Cross-resolution image classification experiments are run with run_crossres.py, which trains on the source resolution for train.epochs epochs, before finetuning on the target resolution for cross_res.finetune_epochs epochs. The code can also be profiled using PyTorch's profiling tools with run_profiler.py.

Flags are handled by Hydra. See cfg/config.yaml for all available flags. Flags can be passed as xxx.yyy=value.

Useful flags

  • net.* describes settings for the FlexNet models (model definition models/ckresnet.py).
  • kernel.* describes settings for the MAGNet kernel generators in FlexConvs, for any model definition that uses FlexConvs.
  • kernel.regularize_params.* describes settings for the anti-aliasing regularization.
    • target=gabor regularizes without the FlexConv Gaussian mask; target=gabor+mask regularized including the FlexConv mask.
  • mask.* describes settings for the FlexConv Gaussian mask.
  • conv.* describes settings for the convolution to use in FlexNet, excluding MAGNet settings. Can be used to switch between FlexConv, CKConv and regular Conv.
  • debug=True: By default, all experiment scripts connect to Weights & Biases to log the experimental results. Use this flag to run without connecting to Weights & Biases.
  • pretrained and related flags: Use these to load checkpoints before training, either from a local file (pretrained and pretrained_params.filepath) or from Weights & Biases (pretrained_wandb and associated flags).
    • In cross-res training, flags can be combined to fine-tune from an existing source res model. Pre-load the final model trained at source resolution (by specifying the correct file), and set train.epochs=0 so source res training is skipped.
  • train.do=False: Only test the model. Useful in combination with pre-training.
    • Note that this flag doesn't work in cross-res training.

Install

conda (recommended)

In order to reproduce our results, please first install the required dependencies. This can be done by:

conda env create -f conda_requirements.yaml

This will create the conda environment flexconv with the correct dependencies.

pip

The same conda environment can be created with pip by running:

conda create -n flexconv python=3.8.5
conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio=0.9.0 cudatoolkit=10.2 -c pytorch
conda activate flexconv
pip install -r requirements.txt

Reproducing experiments

Please see the Experiments readme for details on reproducing the paper's experiments, including checkpoints for selected models.

Cite

If you found this work useful in your research, please consider citing:

@misc{romero2021flexconv,
      title={FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes}, 
      author={David W. Romero and Robert-Jan Bruintjes and Jakub M. Tomczak and Erik J. Bekkers and Mark Hoogendoorn and Jan C. van Gemert},
      year={2021},
      eprint={2110.08059},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgements

We thank Nergis Tömen for her valuable insights regarding signal processing principles for FlexConv, and Silvia-Laura Pintea for explanations and access to code of her work (Pintea et al., 2021). We thank Yerlan Idelbayev for the use of the CIFAR ResNet code.

This work is supported by the Qualcomm Innovation Fellowship (2021) granted to David W. Romero. David W. Romero sincerely thanks Qualcomm for his support. David W. Romero is financed as part of the Efficient Deep Learning (EDL) programme (grant number P16-25), partly funded by the Dutch Research Council (NWO). Robert-Jan Bruintjes is financed by the Dutch Research Council (NWO) (project VI.Vidi.192.100). All authors sincerely thank everyone involved in funding this work.

This work was partially carried out on the Dutch national infrastructure with the support of SURF Cooperative. We used Weights & Biases for experiment tracking and visualization.

You might also like...
PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

R2Plus1D-PyTorch PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition (PyTorch) Paper: https://arxiv.org/abs/2105.01883 Citation: @

an implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation using PyTorch
an implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation using PyTorch

revisiting-sepconv This is a reference implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation [1] using PyTorch. Given two f

Unofficial pytorch implementation of 'Image Inpainting for Irregular Holes Using Partial Convolutions'
Unofficial pytorch implementation of 'Image Inpainting for Irregular Holes Using Partial Convolutions'

pytorch-inpainting-with-partial-conv Official implementation is released by the authors. Note that this is an ongoing re-implementation and I cannot f

Simple Tensorflow implementation of
Simple Tensorflow implementation of "Adaptive Convolutions for Structure-Aware Style Transfer" (CVPR 2021)

AdaConv — Simple TensorFlow Implementation [Paper] : Adaptive Convolutions for Structure-Aware Style Transfer (CVPR 2021) Note This repository does no

TART - A PyTorch implementation for Transition Matrix Representation of Trees with Transposed Convolutions

TART This project is a PyTorch implementation for Transition Matrix Representati

MiraiML: asynchronous, autonomous and continuous Machine Learning in Python

MiraiML Mirai: future in japanese. MiraiML is an asynchronous engine for continuous & autonomous machine learning, built for real-time usage. Usage In

Learning Continuous Image Representation with Local Implicit Image Function
Learning Continuous Image Representation with Local Implicit Image Function

LIIF This repository contains the official implementation for LIIF introduced in the following paper: Learning Continuous Image Representation with Lo

[CVPR'21] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space
[CVPR'21] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space by Quande Liu, Cheng Chen, Ji

Comments
  • Simple Example does not work

    Simple Example does not work

    Hey there!

    Thanks for the great work and open source code.

    I have tried a very simple example but couldnt get it to work:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import ckconv
    from ckconv.nn import CKConv
    from omegaconf import OmegaConf
    
    
    kernel_config = OmegaConf.create({"type": "MLP", "dim_linear": 2, "no_hidden": 2, "no_layers": 3, "activ_function": "ReLU","norm": "BatchNorm","omega_0": 1,"learn_omega_0": False,"weight_norm": False,"steerable": False,"init_spatial_value": 1.0,"bias_init": None,"input_scale": 25.6,"sampling_rate_norm": 1.0,"regularize": False,"regularize_params": {"res": 0 ,"res_offset": 0,"target": "gabor+mask","fn": "l2_relu","method":"together","factor": 0.001,"gauss_stddevs": 2.0,"gauss_factor": 0.5},"srf": {"scale": 0.}})
    
    
    conv_config = OmegaConf.create({"type": "","use_fft": False, "bias": True,"padding": "same","stride": 1,"horizon": "same","cache": False })
    
    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            
            self.conv1 = CKConv(3, 6, kernel_config, conv_config) # nn.Conv2d(3, 6, 5) --> original conv that works
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)
    
        def forward(self, x):
            print("x: ", x.shape)
            y = self.conv1(x)
            print("y: ", y.shape)
            x = self.pool(F.relu(y))
            x = self.pool(F.relu(self.conv2(x)))
            x = torch.flatten(x, 1) # flatten all dimensions except batch
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x
    
    
    net = Net()
    
    
    inn = torch.randn((1,3, 28, 28))
    out = net(inn)
    
    

    -->

    RuntimeError: Given weight of size [2, 2, 1, 1], expected bias to be 1-dimensional with 2 elements, but got bias of size [2, 2] instead
    

    (you can ignore everything after the first conv, borrowed from pytorch examples)

    I tried different configuration (above is only one example).

    Thanks for any help :)

    opened by marcown 4
  • Refactor of ckconv.nn APIs + demo notebook for Arxiv paper

    Refactor of ckconv.nn APIs + demo notebook for Arxiv paper

    Major changes

    • APIs of CKConv and FlexConv now take parameters (instead of ConfigDicts) and have default values, for ease of use.
    • Added demo notebooks, to showcase usage of FlexConv.
    • Added testcases: use testcase.save & testcase.load to save/load a string of training losses to/from file, as a fingerprint for the training run. When loading, if the fingerprint doesn't match, the testcase raises an AssertionError. We use this to verify that any changed code does not change the training behavior.
      • Specifically, I implemented and used this to verify that the other listed changes do not affect the reproducibility of the paper's experiments with this codebase.

    Minor changes

    • regularize_gabornet()s arguments were trimmed.
    opened by rjbruin 1
  • CVE-2007-4559 Patch

    CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have began a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15 year old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsantized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions you may contact us through this projects lead researcher Kasimir Schulz.

    opened by TrellixVulnTeam 0
Releases(v1.1)
Owner
Robert-Jan Bruintjes
PhD student on visual inductive priors @ TU Delft
Robert-Jan Bruintjes
Homepage of paper: Paint Transformer: Feed Forward Neural Painting with Stroke Prediction, ICCV 2021.

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction [Paper] [Official Paddle Implementation] [Huggingface Gradio Demo] [Unofficial

442 Dec 16, 2022
PyTorch Live is an easy to use library of tools for creating on-device ML demos on Android and iOS.

PyTorch Live is an easy to use library of tools for creating on-device ML demos on Android and iOS. With Live, you can build a working mobile app ML demo in minutes.

559 Jan 01, 2023
This was initially the repo for the project of [email protected] of Asaf Mazar, Millad Kassaie and Georgios Chochlakis named "Powered by the Will? Exploring Lay Theories of Behavior Change through Social Media"

Subreddit Analysis This repo includes tools for Subreddit analysis, originally developed for our class project of PSYC 626 in USC, titled "Powered by

Georgios Chochlakis 1 Dec 17, 2021
MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving.

MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving. It is a comprehensive framework for research purpose that integrates popular MWP benchmark datasets and typical deep learnin

119 Jan 04, 2023
Train neural network for semantic segmentation (deep lab V3) with pytorch in less then 50 lines of code

Train neural network for semantic segmentation (deep lab V3) with pytorch in 50 lines of code Train net semantic segmentation net using Trans10K datas

17 Dec 19, 2022
This repo contains the implementation of YOLOv2 in Keras with Tensorflow backend.

Easy training on custom dataset. Various backends (MobileNet and SqueezeNet) supported. A YOLO demo to detect raccoon run entirely in brower is accessible at https://git.io/vF7vI (not on Windows).

Huynh Ngoc Anh 1.7k Dec 24, 2022
Experiments and examples converting Transformers to ONNX

Experiments and examples converting Transformers to ONNX This repository containes experiments and examples on converting different Transformers to ON

Philipp Schmid 4 Dec 24, 2022
SuMa++: Efficient LiDAR-based Semantic SLAM (Chen et al IROS 2019)

SuMa++: Efficient LiDAR-based Semantic SLAM This repository contains the implementation of SuMa++, which generates semantic maps only using three-dime

Photogrammetry & Robotics Bonn 701 Dec 30, 2022
[ICLR'19] Trellis Networks for Sequence Modeling

TrellisNet for Sequence Modeling This repository contains the experiments done in paper Trellis Networks for Sequence Modeling by Shaojie Bai, J. Zico

CMU Locus Lab 460 Oct 13, 2022
NAS-Bench-x11 and the Power of Learning Curves

NAS-Bench-x11 NAS-Bench-x11 and the Power of Learning Curves Shen Yan, Colin White, Yash Savani, Frank Hutter. NeurIPS 2021. Surrogate NAS benchmarks

AutoML-Freiburg-Hannover 13 Nov 18, 2022
A Pytorch reproduction of Range Loss, which is proposed in paper 《Range Loss for Deep Face Recognition with Long-Tailed Training Data》

RangeLoss Pytorch This is a Pytorch reproduction of Range Loss, which is proposed in paper 《Range Loss for Deep Face Recognition with Long-Tailed Trai

Youzhi Gu 7 Nov 27, 2021
UMPNet: Universal Manipulation Policy Network for Articulated Objects

UMPNet: Universal Manipulation Policy Network for Articulated Objects Zhenjia Xu, Zhanpeng He, Shuran Song Columbia University Robotics and Automation

Columbia Artificial Intelligence and Robotics Lab 33 Dec 03, 2022
Back to Basics: Efficient Network Compression via IMP

Back to Basics: Efficient Network Compression via IMP Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta This repository contains the code to r

IOL Lab @ ZIB 1 Nov 19, 2021
The Simplest DCGAN Implementation

DCGAN in TensorLayer This is the TensorLayer implementation of Deep Convolutional Generative Adversarial Networks. Looking for Text to Image Synthesis

TensorLayer Community 310 Dec 13, 2022
Another pytorch implementation of FCN (Fully Convolutional Networks)

FCN-pytorch-easiest Trying to be the easiest FCN pytorch implementation and just in a get and use fashion Here I use a handbag semantic segmentation f

Y. Dong 158 Dec 21, 2022
Spherical CNNs

Spherical CNNs Equivariant CNNs for the sphere and SO(3) implemented in PyTorch Overview This library contains a PyTorch implementation of the rotatio

Jonas Köhler 893 Dec 28, 2022
Malmo Collaborative AI Challenge - Team Pig Catcher

The Malmo Collaborative AI Challenge - Team Pig Catcher Approach The challenge involves 2 agents who can either cooperate or defect. The optimal polic

Kai Arulkumaran 66 Jun 29, 2022
official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu

77 Dec 27, 2022
HALO: A Skeleton-Driven Neural Occupancy Representation for Articulated Hands

HALO: A Skeleton-Driven Neural Occupancy Representation for Articulated Hands Oral Presentation, 3DV 2021 Korrawe Karunratanakul, Adrian Spurr, Zicong

Korrawe Karunratanakul 43 Oct 07, 2022
CLADE - Efficient Semantic Image Synthesis via Class-Adaptive Normalization (TPAMI 2021)

Efficient Semantic Image Synthesis via Class-Adaptive Normalization (Accepted by TPAMI)

tzt 49 Nov 17, 2022