The code for our NeurIPS 2021 paper "Kernelized Heterogeneous Risk Minimization".

Overview

Kernelized-HRM

Jiashuo Liu, Zheyuan Hu

The code for our NeurIPS 2021 paper "Kernelized Heterogeneous Risk Minimization"[1]. This repo contains the codes for our Classification with Spurious Correlation and Regression with Selection Bias simulated experiments, including the data generation process, the whole Kernelized-HRM algorithm and the testing process.

Details

There are two files, named KernelHRM_sim1.py and KernelHRM_sim2.py, which contains the code for the classification simulation experiment and the regression simulation experiment, respectively. The details of codes are:

  • generate_data_list: generate data according to the given parameters args.r_list.

  • generate_test_data_list: generate the test data for Selection Bias experiment, where the args.r_list is pre-defined to [-2.9,-2.7,...,-1.9].

  • main_KernelHRM: the whole framework for our Kernelized-HRM algorithm.

Hypermeters

There are many hyper-parameters to be tuned for the whole framework, which are different among different tasks and require users to carefully tune. Note that although we provide the hyper-parameters for the simulated experiments, it is possible that the results are not exactly the same as ours, which may due to the randomness or something else.

Generally, the following hyper-parameters need carefully tuned:

  • k: controls the dimension of reduced neural tangent features
  • whole_epoch: controls the overall number of iterations between the frontend and the backend
  • epochs: controls the number of epochs of optimizing the invariant learning module in each iteration
  • IRM_lam: controls the strength of the regularizer for the invariant learning
  • lr: learning rate
  • cluster_num: controls the number of clusters

Further, for the experimental settings, the following parameters need to be specified:

  • r_list: controls the strength of spurious correlations
  • scramble: similar to IRM[2], whether to mix the raw features
  • num_list: controls the number of data points from each environment

As for the optimal hyper-parameters for our simulation experiments, we put them into the reproduce.sh file.

Others

Similar to HRM[3], we view the proposed Kernelized-HRM as a framework, which converts the non-linear and complicated data into linear and raw feature data by neural tangent kernel and includes the clustering module and the invariant prediction module. In practice, one can replace each model to anything they want with the same effect.

Though I hate to mention it, our method has the following shortcomings:

  • Just like the original HRM[3], the convergence of the frontend module cannot be guaranteed, and we notice that there may be some cases the next iteration does not improve the current results or even hurts.
  • Hyper-parameters for different tasks may be quite different and need to be tuned carefully.
  • Whether this algorithm can be extended to more complicated image data, such as PACS, NICO et al. remains to be seen.(Maybe later we will have a try?)

Reference

[1] Jiasuho Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen. Kernelized Heterogeneous Risk Minimization. In NeurIPS 2021.

[2] Arjovsky M, Bottou L, Gulrajani I, et al. Invariant risk minimization.

[3] Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen. Heterogeneous Risk Minimziation. In ICML 2021.

Owner
Liu Jiashuo
THU-TrustAI(THU-TAI) Group
Liu Jiashuo
Demonstrates iterative FGSM on Apple's NeuralHash model.

apple-neuralhash-attack Demonstrates iterative FGSM on Apple's NeuralHash model. TL;DR: It is possible to apply noise to CSAM images and make them loo

Lim Swee Kiat 11 Jun 23, 2022
Inference pipeline for our participation in the FeTA challenge 2021.

feta-inference Inference pipeline for our participation in the FeTA challenge 2021. Team name: TRABIT Installation Download the two folders in https:/

Lucas Fidon 2 Apr 13, 2022
A heterogeneous entity-augmented academic language model based on Open Academic Graph (OAG)

Library | Paper | Slack We released two versions of OAG-BERT in CogDL package. OAG-BERT is a heterogeneous entity-augmented academic language model wh

THUDM 58 Dec 17, 2022
Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning

Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning. Circuit Training is an open-s

Google Research 479 Dec 25, 2022
R3Det based on mmdet 2.19.0

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object Installation # install mmdetection first if you haven't installed it

SJTU-Thinklab-Det 38 Dec 15, 2022
Minimalist Error collection Service compatible with Rollbar clients. Sentry or Rollbar alternative.

Minimalist Error collection Service Features Compatible with any Rollbar client(see https://docs.rollbar.com/docs). Just change the endpoint URL to yo

Haukur Rósinkranz 381 Nov 11, 2022
A library for finding knowledge neurons in pretrained transformer models.

knowledge-neurons An open source repository replicating the 2021 paper Knowledge Neurons in Pretrained Transformers by Dai et al., and extending the t

EleutherAI 96 Dec 21, 2022
Contains a bunch of different python programm tasks

py_tasks Contains a bunch of different python programm tasks Armstrong.py - calculate Armsrong numbers in range from 0 to n with / without cache and c

Dmitry Chmerenko 1 Dec 17, 2021
CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields

CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields Paper | Supplementary | Video | Poster If you find our code or paper useful, please

26 Nov 29, 2022
Image-to-Image Translation in PyTorch

CycleGAN and pix2pix in PyTorch New: Please check out contrastive-unpaired-translation (CUT), our new unpaired image-to-image translation model that e

Jun-Yan Zhu 19k Jan 07, 2023
PyTorch implementation of DeepLab v2 on COCO-Stuff / PASCAL VOC

DeepLab with PyTorch This is an unofficial PyTorch implementation of DeepLab v2 [1] with a ResNet-101 backbone. COCO-Stuff dataset [2] and PASCAL VOC

Kazuto Nakashima 995 Jan 08, 2023
Dungeons and Dragons randomized content generator

Component based Dungeons and Dragons generator Supports Entity/Monster Generation NPC Generation Weapon Generation Encounter Generation Environment Ge

Zac 3 Dec 04, 2021
A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

collie_recs Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Coll

ShopRunner 97 Jan 03, 2023
A collection of scripts I developed for personal and working projects.

A collection of scripts I developed for personal and working projects Table of contents Introduction Repository diagram structure List of scripts pyth

Gianluca Bianco 109 Dec 26, 2022
First-Order Probabilistic Programming Language

FOPPL: A First-Order Probabilistic Programming Language This is an implementation of FOPPL, an S-expression based probabilistic programming language d

Renato Costa 23 Dec 20, 2022
PyTorch wrapper for Taichi data-oriented class

Stannum PyTorch wrapper for Taichi data-oriented class PRs are welcomed, please see TODOs. Usage from stannum import Tin import torch data_oriented =

86 Dec 23, 2022
Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning

T2I_CL This is the official Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning Requirements Linux Python

42 Dec 31, 2022
Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021).

Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021).

Varun Nair 37 Dec 30, 2022
Official code for Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

Official code for our Interspeech 2021 - Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset [1]*. Visually-grounded spoken language datasets c

Ian Palmer 3 Jan 26, 2022
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Introduction PyTorch3D provides efficient, reusable components for 3D Computer Vision research with PyTorch. Key features include: Data structure for

Facebook Research 6.8k Jan 01, 2023