MLP-Mixer: An all-MLP Architecture for Vision

This repo contains a PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision.

Usage :

import importlib

import torch

# The module name "mlp-mixer" contains a hyphen, so a plain `import` statement
# is a syntax error; importlib accepts the module name as a string instead.
MLPMixer = importlib.import_module("mlp-mixer").MLPMixer

# Dummy batch: one 3-channel 224x224 image.
img = torch.ones([1, 3, 224, 224])

model = MLPMixer(in_channels=3, image_size=224, patch_size=16, num_classes=1000,
                 dim=512, depth=8, token_dim=256, channel_dim=2048)

# Count trainable parameters, in millions.
parameters = sum(p.numel() for p in model.parameters() if p.requires_grad) / 1_000_000
print('Trainable Parameters: %.3fM' % parameters)

out_img = model(img)

print("Shape of out :", out_img.shape)  # [B, num_classes]
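As a sanity check on the usage example above, the number of patches and a rough trainable-parameter count can be worked out by hand from the constructor arguments. The sketch below is pure Python and does not use this repo's code. The helper names `num_patches` and `estimate_params` are hypothetical, and the estimate assumes the standard Mixer layout from the paper: a per-patch linear embedding with bias, two affine LayerNorms and two biased two-layer MLPs per block, a final LayerNorm, and a linear classification head. If the implementation here differs in any of these details, the count will differ too.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches (tokens) per image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return (image_size // patch_size) ** 2


def estimate_params(in_channels, image_size, patch_size, num_classes,
                    dim, depth, token_dim, channel_dim):
    """Rough parameter count under the assumed (hypothetical) layer layout."""
    s = num_patches(image_size, patch_size)
    # Per-patch linear embedding: weights plus bias.
    embed = in_channels * patch_size ** 2 * dim + dim
    # Token-mixing MLP acts across the s patches: s -> token_dim -> s.
    token_mlp = (s * token_dim + token_dim) + (token_dim * s + s)
    # Channel-mixing MLP acts across the dim channels: dim -> channel_dim -> dim.
    channel_mlp = (dim * channel_dim + channel_dim) + (channel_dim * dim + dim)
    # Two LayerNorms per block, each with a scale and a shift vector.
    norms = 2 * (2 * dim)
    block = token_mlp + channel_mlp + norms
    # Final LayerNorm followed by the linear classification head.
    head = 2 * dim + (dim * num_classes + num_classes)
    return embed + depth * block + head


print(num_patches(224, 16))  # 196
total = estimate_params(in_channels=3, image_size=224, patch_size=16,
                        num_classes=1000, dim=512, depth=8,
                        token_dim=256, channel_dim=2048)
print('Estimated parameters: %.3fM' % (total / 1_000_000))  # 18.528M under these assumptions
```

Comparing this estimate with the `Trainable Parameters` line printed by the model is a quick way to spot a mismatched hyperparameter or an unexpected extra layer.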

Citation :

@misc{tolstikhin2021mlpmixer,
      title={MLP-Mixer: An all-MLP Architecture for Vision}, 
      author={Ilya Tolstikhin and Neil Houlsby and Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Thomas Unterthiner and Jessica Yung and Daniel Keysers and Jakob Uszkoreit and Mario Lucic and Alexey Dosovitskiy},
      year={2021},
      eprint={2105.01601},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

Some components are borrowed from the ViT code in @lucidrains' repo: https://github.com/lucidrains/vit-pytorch

Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision


Owner

Rishikesh (ऋषिकेश)
