Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)

Overview

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology, LMRL Workshop, NeurIPS 2021. [Workshop] [arXiv]
Richard. J. Chen, Rahul G. Krishnan
@article{chen2022self,
  title={Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology},
  author={Chen, Richard J and Krishnan, Rahul G},
  journal={Learning Meaningful Representations of Life, NeurIPS 2021},
  year={2021}
}
DINO illustration

Summary / Main Findings:

  1. In head-to-head comparison of SimCLR versus DINO, DINO learns more effective pretrained representations for histopathology - likely due to 1) not needing negative samples (histopathology has lots of potential class imbalance), 2) capturing better inductive biases about the part-whole hierarchies of how cells are spatially organized in tissue.
  2. ImageNet features do lag behind SSL methods (in terms of data-efficiency), but are better than you think on patch/slide-level tasks. Transfer learning with ImageNet features (from a truncated ResNet-50 after 3rd residual block) gives very decent performance using the CLAM package.
  3. SSL may help mitigate domain shift from site-specific H&E stainining protocols. With vanilla data augmentations, global structure of morphological subtypes (within each class) are more well-preserved than ImageNet features via 2D UMAP scatter plots.
  4. Self-supervised ViTs are able to localize cell location quite well w/o any supervision. Our results show that ViTs are able to localize visual concepts in histopathology in introspecting the attention heads.

Updates

Stay tuned for more updates :).

  • TBA: Pretrained SimCLR and DINO models on TCGA-Lung (Larger working paper, in submission).
  • TBA: Pretrained SimCLR and DINO models on TCGA-PanCancer (Larger working paper, in submission).
  • TBA: PEP8-compliance (cleaning and organizing code).
  • 03/04/2022: Reproducible and largely-working codebase that I'm satisfied with and have heavily tested.

Pre-Reqs

We use Git LFS to version-control large files in this repository (e.g. - images, embeddings, checkpoints). After installing, to pull these large files, please run:

git lfs pull

Pretrained Models

SIMCLR and DINO models were trained for 100 epochs using their vanilla training recipes in their respective papers. These models were developed on 2,055,742 patches (256 x 256 resolution at 20X magnification) extracted from diagnostic slides in the TCGA-BRCA dataset, and evaluated via K-NN on patch-level datasets in histopathology.

Note: Results should be taken-in w.r.t. to the size of dataset and duraration of training epochs. Ideally, longer training with larger batch sizes would demonstrate larger gains in SSL performance.

Arch SSL Method Dataset Epochs Dim K-NN Download
ResNet-50 Transfer ImageNet N/A 1024 0.935 N/A
ResNet-50 SimCLR TCGA-BRCA 100 2048 0.938 Backbone
ViT-S/16 DINO TCGA-BRCA 100 384 0.941 Backbone

Data Download + Data Preprocessing

For CRC-100K and BreastPathQ, pre-extracted embeddings are already available and processed in ./embeddings_patch_library. See patch_extraction_utils.py on how these patch datasets were processed.

Additional Datasets + Custom Implementation: This codebase is flexible for feature extraction on a variety of different patch datasets. To extend this work, simply modify patch_extraction_utils.py with a custom Dataset Loader for your dataset. As an example, we include BCSS (results not yet updated in this work).

  • BCSS (v1): You can download the BCSS dataset from the official Grand Challenge link. For this dataset, we manually developed the train and test dataset splits and labels using majority-voting. Reproducibility for the raw BCSS dataset may be not exact, but we include the pre-extracted embeddings of this dataset in ./embeddings_patch_library (denoted as version 1).

Evaluation: K-NN Patch-Level Classification on CRC-100K + BreastPathQ

Run the notebook patch_extraction.ipynb, followed by patch_evaluation.ipynb. The evaluation notebook should run "out-of-the-box" with Git LFS.

table2

Evaluation: Slide-Level Classification on TCGA-BRCA (IDC versus ILC)

Install the CLAM Package, followed by using the 10-fold cross-validation splits made available in ./slide_evaluation/10foldcv_subtype/tcga_brca. Tensorboard train + validation logs can visualized via:

tensorboard --logdir ./slide_evaluation/results/
table1

Visualization: Creating UMAPs

Install umap-learn (can be tricky to install if you have incompatible dependencies), followed by using the following code snippet in patch_extraction_utils.py, and is used in patch_extraction.ipynb to create Figure 4.

UMAP

Visualization: Attention Maps

Attention visualizations (reproducing Figure 3) can be performed via walking through the following notebook at attention_visualization_256.ipynb.

Attention Visualization

Issues

  • Please open new threads or report issues directly (for urgent blockers) to [email protected].
  • Immediate response to minor issues may not be available.

Acknowledgements, License & Usage

  • Part of this work was performed while at Microsoft Research. We thank the BioML group at Microsoft Research New England for their insightful feedback.
  • This work is still under submission in a formal proceeding. Still, if you found our work useful in your research, please consider citing our paper at:
@article{chen2022self,
  title={Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology},
  author={Chen, Richard J and Krishnan, Rahul G},
  journal={Learning Meaningful Representations of Life, NeurIPS 2021},
  year={2021}
}

© This code is made available under the GPLv3 License and is available for non-commercial academic purposes.

Owner
Richard Chen
Ph.D. Candidate at Harvard
Richard Chen
TorchXRayVision: A library of chest X-ray datasets and models.

torchxrayvision A library for chest X-ray datasets and models. Including pre-trained models. ( 🎬 promo video about the project) Motivation: While the

Machine Learning and Medicine Lab 575 Jan 08, 2023
Code for CVPR2019 paper《Unequal Training for Deep Face Recognition with Long Tailed Noisy Data》

Unequal-Training-for-Deep-Face-Recognition-with-Long-Tailed-Noisy-Data. This is the code of CVPR 2019 paper《Unequal Training for Deep Face Recognition

Zhong Yaoyao 68 Jan 07, 2023
Code for the paper "Graph Attention Tracking". (CVPR2021)

SiamGAT 1. Environment setup This code has been tested on Ubuntu 16.04, Python 3.5, Pytorch 1.2.0, CUDA 9.0. Please install related libraries before r

122 Dec 24, 2022
Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.

Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt

Facebook Research 296 Dec 29, 2022
The open-source and free to use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure

miseval: a metric library for Medical Image Segmentation EVALuation The open-source and free to use Python package miseval was developed to establish

59 Dec 10, 2022
The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022

DG-TrajGen The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022. Our Meth

Wang 25 Sep 26, 2022
Real-time analysis of intracranial neurophysiology recordings.

py_neuromodulation Click this button to run the "Tutorial ML with py_neuro" notebooks: The py_neuromodulation toolbox allows for real time capable pro

Interventional Cognitive Neuromodulation - Neumann Lab Berlin 15 Nov 03, 2022
Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition

Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition How Fast Compare to Other Zero-Shot NAS Proxies on CIFAR-10/100 Pre-trained Model

190 Dec 29, 2022
The official codes of "Semi-supervised Models are Strong Unsupervised Domain Adaptation Learners".

SSL models are Strong UDA learners Introduction This is the official code of paper "Semi-supervised Models are Strong Unsupervised Domain Adaptation L

Yabin Zhang 26 Dec 26, 2022
Automatic differentiation with weighted finite-state transducers.

GTN: Automatic Differentiation with WFSTs Quickstart | Installation | Documentation What is GTN? GTN is a framework for automatic differentiation with

100 Dec 29, 2022
Code and hyperparameters for the paper "Generative Adversarial Networks"

Generative Adversarial Networks This repository contains the code and hyperparameters for the paper: "Generative Adversarial Networks." Ian J. Goodfel

Ian Goodfellow 3.5k Jan 08, 2023
The official implementation of paper Siamese Transformer Pyramid Networks for Real-Time UAV Tracking, accepted by WACV22

SiamTPN Introduction This is the official implementation of the SiamTPN (WACV2022). The tracker intergrates pyramid feature network and transformer in

Robotics and Intelligent Systems Control @ NYUAD 28 Nov 25, 2022
Official page of Struct-MDC (RA-L'22 with IROS'22 option); Depth completion from Visual-SLAM using point & line features

Struct-MDC (click the above buttons for redirection!) Official page of "Struct-MDC: Mesh-Refined Unsupervised Depth Completion Leveraging Structural R

Urban Robotics Lab. @ KAIST 37 Dec 22, 2022
C3D is a modified version of BVLC caffe to support 3D ConvNets.

C3D C3D is a modified version of BVLC caffe to support 3D convolution and pooling. The main supporting features include: Training or fine-tuning 3D Co

Meta Archive 1.1k Nov 14, 2022
Global-Local Context Network for Person Search

Global-Local Context Network for Person Search Abstract: Person search aims to jointly localize and identify a query person from natural, uncropped im

Peng Zheng 15 Oct 17, 2022
DEMix Layers for Modular Language Modeling

DEMix This repository contains modeling utilities for "DEMix Layers: Disentangling Domains for Modular Language Modeling" (Gururangan et. al, 2021). T

Suchin 43 Nov 11, 2022
Code for "Unsupervised State Representation Learning in Atari"

Unsupervised State Representation Learning in Atari Ankesh Anand*, Evan Racah*, Sherjil Ozair*, Yoshua Bengio, Marc-Alexandre Côté, R Devon Hjelm This

Mila 217 Jan 03, 2023
Memory efficient transducer loss computation

Introduction This project implements the optimization techniques proposed in Improving RNN Transducer Modeling for End-to-End Speech Recognition to re

Fangjun Kuang 51 Nov 25, 2022
A graph neural network (GNN) model to predict protein-protein interactions (PPI) with no sample features

A graph neural network (GNN) model to predict protein-protein interactions (PPI) with no sample features

2 Jul 25, 2022
Supervised domain-agnostic prediction framework for probabilistic modelling

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data

The Alan Turing Institute 112 Oct 23, 2022