ColossalAI-Benchmark - Performance benchmarking with ColossalAI

Overview

Benchmark for Tuning Accuracy and Efficiency

Overview

The benchmark includes our efforts in using Colossal-AI to train different tasks to achieve SOTA results. We are interested in both validataion accuracy and training speed, and prefer larger batch size to take advantage of more GPU devices. For example, we trained vision transformer with batch size 512 on CIFAR10 and 4096 on ImageNet1k, which are basically not used in existing works. Some of the results in the benchmark trained with 8x A100 are shown below.

Task Model Training Time Top-1 Accuracy
CIFAR10 ViT-Lite-7/4 ~ 16 min ~ 90.5%
ImageNet1k ViT-S/16 ~ 16.5 h ~ 74.5%

The train.py script in each task runs training with the specific configuration script in configs/ for different parallelisms. Supported parallelisms include data parallel only (ends with vanilla), 1D (ends with 1d), 2D (ends with 2d), 2.5D (ends with 2p5d), 3D (ends with 3d).

Each configuration scripts basically includes the following elements, taking ImageNet1k task as example:

TOTAL_BATCH_SIZE = 4096
LEARNING_RATE = 3e-3
WEIGHT_DECAY = 0.3

NUM_EPOCHS = 300
WARMUP_EPOCHS = 32

# data parallel only
TENSOR_PARALLEL_SIZE = 1    
TENSOR_PARALLEL_MODE = None

# parallelism setting
parallel = dict(
    pipeline=1,
    tensor=dict(mode=TENSOR_PARALLEL_MODE, size=TENSOR_PARALLEL_SIZE),
)

fp16 = dict(mode=AMP_TYPE.TORCH, ) # amp setting

gradient_accumulation = 2 # accumulate 2 steps for gradient update

BATCH_SIZE = TOTAL_BATCH_SIZE // gradient_accumulation # actual batch size for dataloader

clip_grad_norm = 1.0 # clip gradient with norm 1.0

Upper case elements are basically what train.py needs, and lower case elements are what Colossal-AI needs to initialize the training.

Usage

To start training, use the following command to run each worker:

$ DATA=/path/to/dataset python train.py --world_size=WORLD_SIZE \
                                        --rank=RANK \
                                        --local_rank=LOCAL_RANK \
                                        --host=MASTER_IP_ADDRESS \
                                        --port=MASTER_PORT \
                                        --config=CONFIG_FILE

It is also recommended to start training with torchrun as:

$ DATA=/path/to/dataset torchrun --nproc_per_node=NUM_GPUS_PER_NODE \
                                 --nnodes=NUM_NODES \
                                 --node_rank=NODE_RANK \
                                 --master_addr=MASTER_IP_ADDRESS \
                                 --master_port=MASTER_PORT \
                                 train.py --config=CONFIG_FILE
Owner
HPC-AI Tech
We are a global team to help you train and deploy your AI models
HPC-AI Tech
This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning (NeurIPS21).

Core-tuning This repository is the official implementation of ``Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regular

vanint 18 Dec 17, 2022
Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features"

EDM-subgenre-classifier This repository contains the code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Fea

11 Dec 20, 2022
Tensorflow implementation of "Learning Deep Features for Discriminative Localization"

Weakly_detector Tensorflow implementation of "Learning Deep Features for Discriminative Localization" B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and

Taeksoo Kim 363 Jun 29, 2022
Code and datasets for the paper "KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction"

KnowPrompt Code and datasets for our paper "KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction" Requireme

ZJUNLP 137 Dec 31, 2022
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.

============================================================================================================ `MILA will stop developing Theano https:

9.6k Dec 31, 2022
Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models.

Statutory Interpretation Data Set This repository contains the data set created for the following research papers: Savelka, Jaromir, and Kevin D. Ashl

17 Dec 23, 2022
Tensorflow 2 implementation of our high quality frame interpolation neural network

FILM: Frame Interpolation for Large Scene Motion Project | Paper | YouTube | Benchmark Scores Tensorflow 2 implementation of our high quality frame in

Google Research 1.6k Dec 28, 2022
Pytorch reimplementation of PSM-Net: "Pyramid Stereo Matching Network"

This is a Pytorch Lightning version PSMNet which is based on JiaRenChang/PSMNet. use python main.py to start training. PSM-Net Pytorch reimplementatio

XIAOTIAN LIU 1 Nov 25, 2021
Implementation of popular SOTA self-supervised learning algorithms as Fastai Callbacks.

Self Supervised Learning with Fastai Implementation of popular SOTA self-supervised learning algorithms as Fastai Callbacks. Install pip install self-

Kerem Turgutlu 276 Dec 23, 2022
Tensorflow implementation for "Improved Transformer for High-Resolution GANs" (NeurIPS 2021).

HiT-GAN Official TensorFlow Implementation HiT-GAN presents a Transformer-based generator that is trained based on Generative Adversarial Networks (GA

Google Research 78 Oct 31, 2022
PyTorch implementation of Histogram Layers from DeepHist: Differentiable Joint and Color Histogram Layers for Image-to-Image Translation

deep-hist PyTorch implementation of Histogram Layers from DeepHist: Differentiable Joint and Color Histogram Layers for Image-to-Image Translation PyT

Winfried Lötzsch 10 Dec 06, 2022
Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022 ; Official code

Parallel and High-Fidelity Text-to-Lip Generation This repository is the official PyTorch implementation of our AAAI-2022 paper, in which we propose P

Zhying 77 Dec 21, 2022
Deep universal probabilistic programming with Python and PyTorch

Getting Started | Documentation | Community | Contributing Pyro is a flexible, scalable deep probabilistic programming library built on PyTorch. Notab

7.7k Dec 30, 2022
Predicting a person's gender based on their weight and height

Logistic Regression Advanced Case Study Gender Classification: Predicting a person's gender based on their weight and height 1. Introduction We turn o

1 Feb 01, 2022
A simple but complete full-attention transformer with a set of promising experimental features from various papers

x-transformers A concise but fully-featured transformer, complete with a set of promising experimental features from various papers. Install $ pip ins

Phil Wang 2.3k Jan 03, 2023
CUP-DNN is a deep neural network model used to predict tissues of origin for cancers of unknown of primary.

CUP-DNN CUP-DNN is a deep neural network model used to predict tissues of origin for cancers of unknown of primary. The model was trained on the expre

1 Oct 27, 2021
CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes (AAAI2022)

CMUA-Watermark The official code for CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes (AAAI2022) arxiv. It is bas

50 Nov 26, 2022
The repository contains source code and models to use PixelNet architecture used for various pixel-level tasks. More details can be accessed at .

PixelNet: Representation of the pixels, by the pixels, and for the pixels. We explore design principles for general pixel-level prediction problems, f

Aayush Bansal 196 Aug 10, 2022
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

Weakly Supervised Scene Text Detection using Deep Reinforcement Learning This repository contains the setup for all experiments performed in our Paper

Emanuel Metzenthin 3 Dec 16, 2022
Data Augmentation Using Keras and Python

Data-Augmentation-Using-Keras-and-Python Data augmentation is the process of increasing the number of training dataset. Keras library offers a simple

Happy N. Monday 3 Feb 15, 2022