Source code of AAAI 2022 paper "Towards End-to-End Image Compression and Analysis with Transformers".

Last update: Dec 21, 2022

Overview

Towards End-to-End Image Compression and Analysis with Transformers

Usage

The code has been run with Python 3.7, PyTorch 1.8.1, timm 0.4.9 and CompressAI 1.1.4.
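To compare an environment against these versions, a minimal sketch along the following lines may help. It assumes the packages are installed under the PyPI distribution names torch, timm and compressai. The helper report_versions is illustrative and not part of this repository.

```python
# Sketch: report installed package versions next to the versions named above.
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version

# Versions stated in this README; the keys are assumed PyPI distribution names.
EXPECTED = {"torch": "1.8.1", "timm": "0.4.9", "compressai": "1.1.4"}


def report_versions(expected=EXPECTED):
    """Return {name: (expected_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # the package is not installed in this environment
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in report_versions().items():
        print(f"{name}: expected {wanted}, installed {installed}")
```

A value of None in the second position means the package was not found. A local build suffix on the installed version string (for example a CUDA tag on torch) is reported as-is and not treated as a mismatch.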

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure follows the standard layout expected by torchvision's datasets.ImageFolder, with the training data in the train folder and the validation data in the val folder:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
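Before launching a long run, the layout above can be sanity-checked with a short script. The following is a sketch that uses only the standard library. The function name check_imagefolder_layout is hypothetical, and the checks (both splits present, no empty class folder, same class folders in train and val) are assumptions about what is worth verifying, not requirements stated by this repository.

```python
from pathlib import Path


def check_imagefolder_layout(root):
    """Return (classes per split, list of problems) for an ImageNet-style root."""
    root = Path(root)
    classes, problems = {}, []
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing folder: {split_dir}")
            classes[split] = []
            continue
        # Each class is one subfolder; sort the names for a stable comparison.
        names = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
        classes[split] = names
        for name in names:
            if not any(f.is_file() for f in (split_dir / name).iterdir()):
                problems.append(f"empty class folder: {split_dir / name}")
    if classes["train"] != classes["val"]:
        problems.append("train and val have different class folders")
    return classes, problems


if __name__ == "__main__":
    found, issues = check_imagefolder_layout("/path/to/imagenet/")
    print({split: len(names) for split, names in found.items()})
    for issue in issues:
        print("problem:", issue)
```

An empty problem list only means the folder structure looks plausible; it does not validate the image files themselves.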

Pretrained model

The ./pretrained_model directory provides the pretrained model without compression.

Test

Please adjust --data-path and run sh test.sh:

python main.py --eval --resume ./pretrain_s/checkpoint.pth --model pretrained_model --data-path /path/to/imagenet/ --output_dir ./eval

The checkpoint ./pretrain_s/checkpoint.pth can be downloaded from Baidu Netdisk with access code aaai.

Train

Please adjust --data-path and run sh train.sh:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model pretrained_model --no-model-ema --clip-grad 1.0 --batch-size 128 --num_workers 16 --data-path /path/to/imagenet/ --output_dir ./ckp_pretrain

Full model

The ./full_model directory provides the full model with compression.

Test

Please adjust --data-path and --resume, then run sh test.sh:

python main.py --eval --resume ./ckp_s_q1/checkpoint.pth --model full_model --no-pretrained --data-path /path/to/imagenet/ --output_dir ./eval

The checkpoints ./ckp_s_q1/checkpoint.pth, ./ckp_s_q2/checkpoint.pth and ./ckp_s_q3/checkpoint.pth can be downloaded from Baidu Netdisk with access code aaai.

Train

Please download ./pretrain_s/checkpoint.pth from Baidu Netdisk with access code aaai, then adjust --data-path and --quality. Each --quality value corresponds to the following alpha and beta values:

quality	alpha	beta
1	0.1	0.001
2	0.3	0.003
3	0.6	0.006
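For reference in scripts, the table above can be written as a small lookup. This is only a sketch: the names QUALITY_TABLE and weights_for_quality are hypothetical, and how main.py maps --quality to alpha and beta internally is not shown in this README.

```python
# Sketch: the quality / alpha / beta table above as a dictionary lookup.
# The values are copied from the table; the names are illustrative only.
QUALITY_TABLE = {
    1: {"alpha": 0.1, "beta": 0.001},
    2: {"alpha": 0.3, "beta": 0.003},
    3: {"alpha": 0.6, "beta": 0.006},
}


def weights_for_quality(quality):
    """Return (alpha, beta) for a quality level listed in the table."""
    try:
        row = QUALITY_TABLE[quality]
    except KeyError:
        raise ValueError(
            f"quality must be one of {sorted(QUALITY_TABLE)}, got {quality!r}"
        ) from None
    return row["alpha"], row["beta"]


if __name__ == "__main__":
    for q in sorted(QUALITY_TABLE):
        alpha, beta = weights_for_quality(q)
        print(f"quality={q}: alpha={alpha}, beta={beta}")
```

Raising an error for an unlisted quality keeps a mistyped value from silently selecting nothing.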

Run sh train.sh:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model full_model --batch-size 128 --num_workers 16 --clip-grad 1.0 --quality 1 --data-path /path/to/imagenet/ --output_dir ./ckp_full
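To train all three quality levels in turn, the command above can be generated from a script. The sketch below only assembles the argument list from the flags shown in this README. The helper name build_train_command and the per-quality output directories ./ckp_full_q1 to ./ckp_full_q3 are hypothetical and not names used by the repository.

```python
import shlex


def build_train_command(quality, data_path, output_dir, nproc_per_node=8):
    """Assemble the full-model training command as an argument list."""
    if quality not in (1, 2, 3):
        raise ValueError(f"quality must be 1, 2 or 3, got {quality!r}")
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}", "--use_env", "main.py",
        "--model", "full_model",
        "--batch-size", "128",
        "--num_workers", "16",
        "--clip-grad", "1.0",
        "--quality", str(quality),
        "--data-path", str(data_path),
        "--output_dir", str(output_dir),
    ]


if __name__ == "__main__":
    for q in (1, 2, 3):
        # Hypothetical output directory naming, one folder per quality level.
        cmd = build_train_command(q, "/path/to/imagenet/", f"./ckp_full_q{q}")
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the ./full_model directory. Building a list instead of a single string avoids quoting problems when the data path contains spaces.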

Citation

@InProceedings{Bai2022AAAI,
  title={Towards End-to-End Image Compression and Analysis with Transformers},
  author={Bai, Yuanchao and Yang, Xu and Liu, Xianming and Jiang, Junjun and Wang, Yaowei and Ji, Xiangyang and Gao, Wen},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2022}
}
