A simple approach to enable dense segmentation with ViT.

Last update: Jan 03, 2023

Overview

Vision Transformer Segmentation Network

This PyTorch implementation of ViT produces an output of the same spatial size as the input by applying the inverse of the patch rearrange operation to the predicted output tokens. This enables convolution-free multi-class segmentation.

Most of the code is taken from https://github.com/lucidrains/vit-pytorch/blob/main/vit_pytorch/vit.py
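
The round trip between an image and its patch tokens can be illustrated with a small sketch. This is not the repository's code: it uses NumPy reshape/transpose as a stand-in for the einops pattern 'b c (h p1) (w p2) -> b (h w) (p1 p2 c)' used in the linked vit.py, and the function names to_patches and from_patches are hypothetical. It assumes the image height and width are divisible by the patch size.

```python
import numpy as np

def to_patches(img, p):
    """(b, c, H, W) -> (b, h*w, p*p*c), i.e. 'b c (h p1) (w p2) -> b (h w) (p1 p2 c)'."""
    b, c, H, W = img.shape
    h, w = H // p, W // p
    x = img.reshape(b, c, h, p, w, p)      # split H into (h, p1) and W into (w, p2)
    x = x.transpose(0, 2, 4, 3, 5, 1)      # reorder axes to b, h, w, p1, p2, c
    return x.reshape(b, h * w, p * p * c)  # one flat vector per patch

def from_patches(tokens, p, c, H, W):
    """Inverse of to_patches: (b, h*w, p*p*c) -> (b, c, H, W)."""
    b = tokens.shape[0]
    h, w = H // p, W // p
    x = tokens.reshape(b, h, w, p, p, c)   # axes: b, h, w, p1, p2, c
    x = x.transpose(0, 5, 1, 3, 2, 4)      # reorder axes to b, c, h, p1, w, p2
    return x.reshape(b, c, H, W)           # merge (h, p1) -> H and (w, p2) -> W

# Round trip with the default sizes: 112x112 image, 1 channel, 7x7 patches.
img = np.arange(112 * 112, dtype=np.float32).reshape(1, 1, 112, 112)
tokens = to_patches(img, 7)
restored = from_patches(tokens, 7, 1, 112, 112)
```

In a segmentation setting, the same inverse would presumably be applied to the transformer's output tokens, with c set to the number of output classes rather than the number of input channels; that mapping is an assumption here, not something confirmed by the text above.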

Default Architecture Parameters:

model = ViTSeg( image_size=112, 
                channels=1,
                patch_size=7, 
                num_classes=1, 
                dim=768, 
                depth=6, 
                heads=12, 
                mlp_dim=2048, 
                learned_pos=False, 
                use_token=False)

image_size: An integer or a tuple defining the size of the input image (supporting arbitrary image sizes would require some code changes)
channels: An integer defining the number of channels in the input image
patch_size: An integer or a tuple defining the size of the patches
num_classes: An integer representing the number of channels in the output
dim: An integer defining the size of the embedding dimension
depth: An integer defining the number of transformer layers
heads: An integer defining the number of attention heads in the transformer layers
mlp_dim: An integer defining the size of the MLP in the transformer layers
learned_pos: A boolean which, if true, switches from fixed positional encodings to learned positional encodings
use_token: A boolean which, if true, adds a CLS token to the input and output
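
To make the default values concrete, the following sketch works out the token layout they imply. The helper token_layout is hypothetical and not part of the repository. It assumes the image size is divisible by the patch size, and that each output token is mapped back to a patch of patch_height * patch_width * num_classes values, which is an assumption about the output head.

```python
def _pair(value):
    """Accept an int or a (height, width) tuple, as the parameters above allow."""
    return value if isinstance(value, tuple) else (value, value)

def token_layout(image_size=112, patch_size=7, channels=1, num_classes=1, use_token=False):
    """Hypothetical helper: derive sequence length and per-patch sizes."""
    img_h, img_w = _pair(image_size)
    p_h, p_w = _pair(patch_size)
    if img_h % p_h or img_w % p_w:
        raise ValueError("image size must be divisible by patch size")
    num_patches = (img_h // p_h) * (img_w // p_w)
    return {
        "num_patches": num_patches,
        # One extra position when a CLS token is added.
        "sequence_length": num_patches + (1 if use_token else 0),
        # Flattened size of one input patch before the embedding to dim.
        "patch_input_dim": p_h * p_w * channels,
        # Assumed flattened size of one output patch.
        "patch_output_dim": p_h * p_w * num_classes,
    }

layout = token_layout()
print(layout)
```

With the defaults, a 112x112 single-channel image is split into a 16x16 grid of 7x7 patches, so the transformer sees 256 tokens of 49 input values each, embedded to dim=768.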

Citation

If you find this repository useful, please consider citing it:

@article{reynaud2021vitseg,
  title={ViTSeg-https://github.com/HReynaud/ViTSeg}, 
  url={https://github.com/HReynaud/ViTSeg},  
  Author={Reynaud, Hadrien}, 
  Year={2021}
}

Owner

HReynaud