MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space

Last update: Dec 15, 2022

Update (20 Jan 2020): MODALS on text data is available.

Introduction
Getting Started
Run MODALS search
Run MODALS training
Citation

Introduction

MODALS is a framework for applying automated data augmentation to data of any modality in a generic way. It uses automated data augmentation to fine-tune four universal data transformation operations in the latent space, adapting the transformations to data of different modalities.
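To make the idea of latent-space transformations concrete, here is a simplified sketch in plain Python. It assumes the four operations are the ones named in the paper (hard example interpolation, hard example extrapolation, Gaussian noise, and difference transform). The function names, the list-of-floats representation, and the `lam` magnitude parameter are hypothetical and are not the repository's API; the actual implementation works on PyTorch tensors and may differ in detail.

```python
import random
from statistics import mean, pstdev

# NOTE: hypothetical helper names, not the repository's API.
# Latent vectors are plain lists of floats, and every example passed in
# is assumed to belong to the same class as z.

def interpolate(z, hard, lam):
    """Move z toward a hard example of the same class."""
    return [a + lam * (h - a) for a, h in zip(z, hard)]

def extrapolate(z, center, lam):
    """Push z away from the class center."""
    return [a + lam * (a - c) for a, c in zip(z, center)]

def gaussian_noise(z, sigma, lam, rng=random):
    """Add zero-mean noise scaled by a per-dimension class std."""
    return [a + lam * rng.gauss(0.0, s) for a, s in zip(z, sigma)]

def difference(z, z_i, z_j, lam):
    """Translate z by the difference of two other same-class examples."""
    return [a + lam * (p - q) for a, p, q in zip(z, z_i, z_j)]

def class_stats(latents):
    """Per-dimension mean and standard deviation of one class."""
    dims = list(zip(*latents))
    return [mean(d) for d in dims], [pstdev(d) for d in dims]

# Toy latent vectors for a single class.
latents = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
center, sigma = class_stats(latents)
z = latents[0]
print(interpolate(z, latents[2], 0.5))
print(extrapolate(z, center, 0.5))
print(gaussian_noise(z, sigma, 0.1))
print(difference(z, latents[1], latents[2], 0.5))
```

In this sketch every operation shifts `z` by a scaled direction, so a single magnitude value controls how strong the augmentation is. That magnitude is the kind of quantity an automated search could tune per operation.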

This repository contains code for the paper "MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space" (https://openreview.net/pdf?id=XjYgR6gbCEc), implemented using the PyTorch library. It includes policy search and training for the SST2 and TREC6 datasets.

Getting Started

Code supports Python 3.

Install requirements

pip install -r requirements.txt

Setting up directory path

In modals/setup.py, set DATA_DIR to the dataset path and EMB_DIR to the path of the directory that contains the GloVe embeddings.
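As an illustration, the two settings might look like the sketch below. Only the names DATA_DIR and EMB_DIR come from this README; the paths are placeholders, and the `missing_dirs` helper is a hypothetical addition for checking the configuration, not part of the repository.

```python
import os

# Placeholder paths: replace with your own locations.
DATA_DIR = "/path/to/datasets"   # root folder holding the SST2 / TREC6 data
EMB_DIR = "/path/to/glove"       # folder containing the GloVe embedding files

def missing_dirs(*paths):
    """Return the given paths that do not exist as directories."""
    return [p for p in paths if not os.path.isdir(p)]

# Warn early instead of failing later inside the search or training run.
for path in missing_dirs(DATA_DIR, EMB_DIR):
    print(f"Warning: directory not found: {path}")
```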

Run MODALS search

The script to search for the augmentation policy for the SST2 and TREC6 datasets is located in scripts/search.sh. Pass the dataset name as the argument when calling the script.

For example, to search for the augmentation policy for the SST2 dataset:

bash scripts/search.sh sst2

The training log and candidate policies of the search will be output to the ./ray_experiments directory.
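To inspect what a search run produced, a small helper like the following can list the files under the output directory. This is a sketch only: the README does not describe the layout or file names inside ./ray_experiments, so the function (a hypothetical name) simply walks the directory and makes no assumption about its contents.

```python
from pathlib import Path

def list_search_outputs(root="./ray_experiments"):
    """Return all files under root, newest first; empty list if root is missing."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    files = [p for p in root_path.rglob("*") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)

for path in list_search_outputs():
    print(path)
```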

Run MODALS training

Two searched policies are included in the ./schedule directory. The script to apply a searched policy when training on SST2 or TREC6 is located in scripts/train.sh. Pass the dataset name as the argument when calling the script. For example, to train on SST2:

bash scripts/train.sh sst2

Citation

If you use MODALS in your research, please cite:

@inproceedings{cheung2021modals,
  title     =  {{MODALS}: Modality-agnostic Automated Data Augmentation in the Latent Space},
  author    =  {Tsz-Him Cheung and Dit-Yan Yeung},
  booktitle =  {International Conference on Learning Representations},
  year      =  {2021},
  url       =  {https://openreview.net/forum?id=XjYgR6gbCEc}
}
