A simple, unofficial implementation of MAE using pytorch-lightning

Last update: Dec 03, 2022

Masked Autoencoders in PyTorch

A simple, unofficial implementation of MAE (Masked Autoencoders are Scalable Vision Learners) using pytorch-lightning.

Training is currently implemented for the CUB and StanfordCars datasets, and the code is easily extensible to other image datasets.

Setup

# Clone the repository
git clone https://github.com/catalys1/mae-pytorch.git
cd mae-pytorch

# Install required libraries (inside a virtual environment preferably)
pip install -r requirements.txt

# Set up .env for path to data
echo "DATADIR=/path/to/data" > .env
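To illustrate what the .env step provides, here is a minimal, stdlib-only sketch of how a DATADIR entry could be resolved. The repository relies on python-dotenv for this; the helpers `parse_env_text` and `resolve_data_dir` below are hypothetical stand-ins and are not part of this project. The sketch assumes that a variable already set in the process environment takes precedence over the file, which matches python-dotenv's default behavior.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_data_dir(env_path=".env", environ=None, default=None):
    """Look up DATADIR in the environment first, then in the .env file."""
    environ = os.environ if environ is None else environ
    if "DATADIR" in environ:
        return environ["DATADIR"]
    env_file = Path(env_path)
    if env_file.is_file():
        return parse_env_text(env_file.read_text()).get("DATADIR", default)
    return default


# Mirrors the line written by the setup command above.
example = parse_env_text("# path to datasets\nDATADIR=/path/to/data\n")
print(example["DATADIR"])
```

Data-loading code can then join dataset names onto the resolved directory, so the same configuration files work across machines with different storage layouts.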

Usage

MAE training

Training options are provided through configuration files, handled by LightningCLI. See configs/ for examples.

Train an MAE model on the CUB dataset:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml

Using multiple GPUs:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml --config=configs/multigpu.yaml
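The commands above pass several --config files at once. LightningCLI applies them in order, so values in later files are expected to override those in earlier ones. The sketch below illustrates that layering idea with plain dictionaries. It is not LightningCLI's actual merge code, and the keys and values (`max_epochs`, `devices`, `strategy`, and so on) are hypothetical placeholders rather than the contents of the files in configs/.

```python
import copy


def merge_configs(base, override):
    """Recursively merge `override` into a copy of `base`; later values win."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical contents standing in for the three YAML files.
mae_cfg = {"trainer": {"max_epochs": 100, "devices": 1}, "model": {"name": "mae"}}
data_cfg = {"data": {"dataset": "cub"}}
multigpu_cfg = {"trainer": {"devices": 4, "strategy": "ddp"}}

# Apply the files in the same order as the --config flags.
final_cfg = {}
for cfg in (mae_cfg, data_cfg, multigpu_cfg):
    final_cfg = merge_configs(final_cfg, cfg)

print(final_cfg["trainer"])
```

Keeping the multi-GPU settings in their own file means the same model and data configuration can be reused unchanged on a single GPU.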

Fine-tuning

Not yet implemented.

Implementation

The default model uses ViT-Base for the encoder and a small ViT (depth=4, width=192) for the decoder. This configuration is smaller than the model used in the paper.
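For context on why a small decoder is workable: in MAE, the encoder only processes the visible patches, and the lightweight decoder reconstructs the masked ones. The sketch below shows the random patch-masking step in plain Python. It is an illustration, not this repository's implementation. The 224x224 input size, 16x16 patch size, and 75% mask ratio are assumptions taken from the paper's defaults, and `random_masking` is a hypothetical helper name.

```python
import random


def random_masking(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into a visible set (encoder input) and a masked set."""
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_keep = int(num_patches * (1 - mask_ratio))
    visible = sorted(indices[:num_keep])  # patches the encoder sees
    masked = sorted(indices[num_keep:])   # patches the decoder must reconstruct
    return visible, masked


# Assumed values: 224x224 input and 16x16 patches.
image_size, patch_size = 224, 16
num_patches = (image_size // patch_size) ** 2
visible, masked = random_masking(num_patches, mask_ratio=0.75, seed=0)
print(num_patches, len(visible), len(masked))
```

Under these assumptions the encoder processes 49 of the 196 patches per image, which is where most of the compute savings during pre-training come from.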

Dependencies

Configuration and training are handled entirely by pytorch-lightning.
The MAE model uses the VisionTransformer from timm.
FGVC datasets are accessed through fgvcdata.
Environment variables are configured through python-dotenv.

Results

Image reconstructions of CUB validation set images after training with the following command:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml --config=configs/multigpu.yaml

Owner

Connor Anderson
