Boosted CVaR Classification (NeurIPS 2021)

Last update: Feb 15, 2022

Boosted CVaR Classification

Runtian Zhai, Chen Dan, Arun Sai Suggala, Zico Kolter, Pradeep Ravikumar
NeurIPS 2021

Table of Contents
- Quick Start
  - Train
  - Evaluation
- Introduction
- Algorithms
- Parameters
- Citation and Contact

Quick Start

Before running the code, please install all the required packages in requirements.txt by running:

pip install -r requirements.txt

In the code, we solve linear programs with the MOSEK solver, which requires a license. You can acquire a free academic license from https://www.mosek.com/products/academic-licenses/. Please make sure that the license file is placed in the folder where MOSEK looks for it; otherwise the solver will not run.
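As a quick sanity check before training, the sketch below looks for a license file in the locations MOSEK conventionally searches: the path named by the MOSEKLM_LICENSE_FILE environment variable, then mosek/mosek.lic under the home directory. The helper name find_mosek_license is hypothetical and not part of this repository, and the check does not cover other setups such as a floating license server.

```python
import os
from pathlib import Path


def find_mosek_license(env=None, home=None):
    """Return the first existing license file path, or None.

    Hypothetical helper (not part of this repository). It only checks
    the two conventional locations for a local license file.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    candidates = []
    # An explicit path in the environment variable takes priority.
    if env.get("MOSEKLM_LICENSE_FILE"):
        candidates.append(Path(env["MOSEKLM_LICENSE_FILE"]))
    # Conventional default location: <home>/mosek/mosek.lic
    candidates.append(home / "mosek" / "mosek.lic")

    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_mosek_license()
    print(found if found else "No MOSEK license file found")
```

If the helper reports that no file was found, move the license file to one of the two locations before running train.py or test.py.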

Train

To train a set of base models with boosting, run the following shell command:

python train.py --dataset [DATASET] --data_root /path/to/dataset \
                --alg [ALGORITHM] --epochs [EPOCHS] --iters_per_epoch [ITERS] \
                --scheduler [SCHEDULER] --warmup [WARMUP_EPOCHS] --seed [SEED]

Use the --download option to download the dataset if you are running the code for the first time. Use the --save_file option to save your training results to a .mat file. Set the training hyperparameters with --alpha, --beta and --eta.

For example, to train a set of base models on CIFAR-10 with AdaLPBoost, use the following shell command:

python train.py --dataset cifar10 --data_root data --alg adalpboost \
                --eta 1.0 --epochs 100 --iters_per_epoch 5000 \
                --scheduler 2000,4000 --warmup 20 --seed 2021 \
                --save_file cifar10.mat

Evaluation

To evaluate the models trained with the above command, run:

python test.py --file cifar10.mat

Introduction

In this work, we study the CVaR classification problem, which requires a classifier to have a low α-CVaR loss, i.e., a low average loss over the worst α fraction of the samples in the dataset. Previous work showed that no deterministic model learning algorithm can achieve a lower α-CVaR loss than ERM; we address this issue by learning randomized models. Specifically, we propose the Boosted CVaR Classification framework, which learns ensemble models via boosting. Our motivation comes from the direct relationship between the CVaR loss and the LPBoost objective. We implement two algorithms based on the framework: one uses LPBoost, and the other, named AdaLPBoost, uses AdaBoost to pick the sample weights and LPBoost to pick the model weights.
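To make the definition concrete, here is a minimal sketch of the α-CVaR loss as the average loss over the worst α fraction of samples. The function name cvar_loss is illustrative and not taken from the repository; when α times the number of samples is not an integer, the sketch gives the boundary sample a fractional weight, which is one common convention.

```python
import math


def cvar_loss(losses, alpha):
    """Average of the largest alpha-fraction of per-sample losses.

    Illustrative helper, not the repository's implementation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    n = len(losses)
    k = alpha * n                      # number of "worst" samples, possibly fractional
    ordered = sorted(losses, reverse=True)
    full = min(int(math.floor(k)), n)  # samples counted with full weight
    total = sum(ordered[:full])
    if full < n:
        # Fractional weight on the next-worst sample when k is not an integer.
        total += (k - full) * ordered[full]
    return total / k


if __name__ == "__main__":
    # 0-1 losses of a classifier that misclassifies 2 out of 10 samples.
    zero_one = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
    for a in (0.2, 0.5, 1.0):
        print(a, cvar_loss(zero_one, a))
```

With α = 1 the quantity reduces to the ordinary average loss; smaller α concentrates on the hardest samples, so a classifier with 80% accuracy already has the maximal 0-1 CVaR loss at α = 0.2.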

Algorithms

We implement three algorithms in algs.py:

Name	Description
uniform	All sample weight vectors are uniform distributions.
lpboost	Regularized LPBoost (set `--beta` for regularization).
adalpboost	α-AdaLPBoost.

train.py only trains the base models. After the base models are trained, use test.py to select the model weights by solving the dual LPBoost problem.
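The weight-selection step can be sketched as a small linear program. The example below assumes the goal is to pick mixture weights over the base models that minimize the α-CVaR of the ensemble's expected per-sample loss, written in the standard Rockafellar-Uryasev form. It uses SciPy's linprog with the HiGHS backend instead of MOSEK, and the function name select_model_weights and the loss-matrix layout are assumptions for illustration; test.py may formulate and solve the problem differently.

```python
import numpy as np
from scipy.optimize import linprog


def select_model_weights(losses, alpha):
    """Mixture weights minimizing the alpha-CVaR of the expected loss.

    losses[t, i] is the loss of base model t on sample i (assumed layout).
    Returns (weights, optimal alpha-CVaR value).
    """
    losses = np.asarray(losses, dtype=float)
    T, n = losses.shape
    # Variables: [lam_1..lam_T, eta, s_1..s_n]
    # Objective: eta + sum_i s_i / (alpha * n)
    c = np.concatenate([np.zeros(T), [1.0], np.full(n, 1.0 / (alpha * n))])
    # For every sample i: sum_t lam_t * losses[t, i] - eta - s_i <= 0
    A_ub = np.hstack([losses.T, -np.ones((n, 1)), -np.eye(n)])
    b_ub = np.zeros(n)
    # Mixture weights sum to one.
    A_eq = np.concatenate([np.ones(T), [0.0], np.zeros(n)])[None, :]
    b_eq = [1.0]
    # lam >= 0, eta free, s >= 0
    bounds = [(0, None)] * T + [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:T], res.fun


if __name__ == "__main__":
    # Two base models, two samples; each model errs on a different sample.
    toy_losses = [[1, 0],
                  [0, 1]]
    weights, value = select_model_weights(toy_losses, alpha=0.5)
    print(weights, value)
```

In this toy case each base model alone has a 0.5-CVaR loss of 1.0, because its single worst sample is always misclassified, while the randomized mixture spreads the errors and halves that value. This is the kind of gain from randomization that the framework targets.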

Parameters

All default training parameters can be found in config.py. For Regularized LPBoost we use β = 100 for all α. For AdaLPBoost we use η = 1.0.
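To give a sense of what η controls, the sketch below shows a generic AdaBoost-style exponential update of the sample weights, in which η scales how strongly a sample's loss increases its weight in the next round. This is the textbook multiplicative update under assumed names (exp_weight_update is hypothetical); the exact rule used by α-AdaLPBoost in algs.py may differ in its details.

```python
import math


def exp_weight_update(weights, losses, eta):
    """One AdaBoost-style step: upweight samples with high loss.

    Generic sketch, not the repository's implementation.
    """
    scaled = [w * math.exp(eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    # Renormalize so the weights remain a probability distribution.
    return [s / total for s in scaled]


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    zero_one = [1, 0, 0, 0]  # the current base model errs on sample 0 only
    print(exp_weight_update(uniform, zero_one, eta=1.0))
```

A larger η shifts weight toward misclassified samples more aggressively, while η = 0 leaves the weights unchanged.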

Citation and Contact

To cite this work, please use the following BibTeX entry:

@inproceedings{zhai2021boosted,
  author = {Zhai, Runtian and Dan, Chen and Suggala, Arun Sai and Kolter, Zico and Ravikumar, Pradeep},
  booktitle = {Advances in Neural Information Processing Systems},
  title = {Boosted CVaR Classification},
  volume = {34},
  year = {2021}
}

To contact us, please email Runtian Zhai <[email protected]>.
