The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.

Copyright © German Cancer Research Center (DKFZ), Division of Medical Image Computing (MIC). Please make sure that your usage of this code is in compliance with the code license.

Overview

This is a comprehensive framework for object detection featuring:

  • 2D + 3D implementations of prevalent object detectors: e.g. Mask R-CNN [1], Retina Net [2], Retina U-Net [3].
  • modular and lightweight structure that shares all processing steps (incl. backbone architecture) across models for comparability.
  • training with bounding box and/or pixel-wise annotations.
  • dynamic patching and tiling of 2D + 3D images (for training and inference).
  • weighted consolidation of box predictions across patch-overlaps, ensembles, and dimensions [3].
  • monitoring + evaluation simultaneously on object and patient level.
  • 2D + 3D output visualizations.
  • integration of COCO mean average precision metric [5].
  • integration of MIC-DKFZ batch generators for extensive data augmentation [6].
  • easy modification to evaluation of instance segmentation and/or semantic segmentation.

[1] He, Kaiming, et al. "Mask R-CNN." ICCV, 2017.
[2] Lin, Tsung-Yi, et al. "Focal Loss for Dense Object Detection." TPAMI, 2018.
[3] Jaeger, Paul, et al. "Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection." 2018.

[5] https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py
[6] https://github.com/MIC-DKFZ/batchgenerators

How to cite this code

Please cite the original publication [3].

Installation

Set up the package in a virtual environment:

git clone https://github.com/pfjaeger/medicaldetectiontoolkit.git
cd medicaldetectiontoolkit
virtualenv -p python3.6 venv
source venv/bin/activate
pip3 install -e .

We use two CUDA functions: Non-Maximum Suppression (taken from pytorch-faster-rcnn and adapted for 3D) and RoiAlign (taken from RoiAlign, fixed according to this bug report, and adapted for 3D). In this framework, they come pre-compiled for TitanX. If you have a different GPU, you need to re-compile these functions, replacing [arch] with the value for your GPU:

GPU             arch
TitanX          sm_52
GTX 960M        sm_50
GTX 1070        sm_61
GTX 1080 (Ti)   sm_61

cd cuda_functions/nms_xD/src/cuda/
nvcc -c -o nms_kernel.cu.o nms_kernel.cu -x cu -Xcompiler -fPIC -arch=[arch]
cd ../../
python build.py
cd ../../

cd cuda_functions/roi_align_xD/roi_align/src/cuda/
nvcc -c -o crop_and_resize_kernel.cu.o crop_and_resize_kernel.cu -x cu -Xcompiler -fPIC -arch=[arch]
cd ../../
python build.py
cd ../../../

Prepare the Data

This framework is meant for training models on your own data sets. Two example data loaders are provided in medicaldetectiontoolkit/experiments, with thorough documentation to ensure a quick start for your own project. Data loading works in two stages: a preprocessing script, run only once, saves the data (whatever its original type) as numpy arrays. During training and testing, the data loader then loads these numpy arrays dynamically.

Please note that the data input side is meant to be customized according to your own needs; the provided data loaders are merely examples:

  • LIDC has a powerful data loader that handles 2D/3D inputs and is optimized for patch-based training and inference.
  • The toy experiments have a lightweight data loader that handles only 2D inputs without patching. This one makes sense if you want to get familiar with the framework.
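The preprocess-once, load-dynamically pattern described above could look roughly like the following. This is a minimal sketch and not the toolkit's own loader: the function names, the file naming scheme, the z-score normalization, and the use of memory mapping are all assumptions made for illustration.

```python
from pathlib import Path

import numpy as np


def preprocess_case(raw_image, raw_mask, out_dir, case_id):
    """Run once per case: normalize the image and store image and mask as .npy files.

    Hypothetical helper; the normalization is a placeholder for your own preprocessing.
    """
    img = np.asarray(raw_image, dtype=np.float32)
    img = (img - img.mean()) / (img.std() + 1e-8)  # simple z-score normalization
    seg = np.asarray(raw_mask, dtype=np.uint8)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    np.save(out_dir / f"{case_id}_img.npy", img)
    np.save(out_dir / f"{case_id}_seg.npy", seg)


def load_case(out_dir, case_id):
    """Called at training/test time: open the stored arrays memory-mapped.

    With mmap_mode="r" only the slices that are actually indexed are read from disk.
    """
    out_dir = Path(out_dir)
    img = np.load(out_dir / f"{case_id}_img.npy", mmap_mode="r")
    seg = np.load(out_dir / f"{case_id}_seg.npy", mmap_mode="r")
    return img, seg
```

A custom data loader would call something like load_case for each sampled case and crop its patch from the returned arrays.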

Execute

  1. Set I/O paths, model, and training specifics in the configs file: medicaldetectiontoolkit/experiments/my_experiment/configs.py

  2. Train the model:

    python exec.py --mode train --exp_source experiments/my_experiment --exp_dir path/to/experiment/directory       
    

    This copies snapshots of configs and model to the specified exp_dir, where all outputs will be saved. By default, the data is split into 60% training, 20% validation, and 20% testing data to perform a 5-fold cross validation (this can be changed to a hold-out test set in the configs), and all folds are trained one after another. To train only specific folds, pass them with the folds arg:

    python exec.py --folds 0 1 2 .... # specify any combination of folds [0-4]
    
  3. Run inference:

    python exec.py --mode test --exp_dir path/to/experiment/directory 
    

    This runs the prediction pipeline and saves all results to exp_dir.
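To make the default 60/20/20 split from step 2 concrete, here is a small sketch of how a 5-fold cross validation with separate validation and test parts can be laid out. The function name make_folds, the seed argument, and the choice of which chunk serves as validation set are assumptions for illustration; the toolkit's actual splitting logic may differ.

```python
import random


def make_folds(case_ids, n_folds=5, seed=0):
    """Split case ids into n_folds partitions of train / val / test ids.

    Each fold uses one chunk for testing, the next chunk for validation,
    and the remaining chunks for training (60/20/20 for n_folds=5).
    """
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for reproducibility
    chunks = [ids[i::n_folds] for i in range(n_folds)]
    folds = []
    for k in range(n_folds):
        val_k = (k + 1) % n_folds
        train = [c for j, chunk in enumerate(chunks)
                 if j not in (k, val_k) for c in chunk]
        folds.append({"train": train, "val": chunks[val_k], "test": chunks[k]})
    return folds
```

Across the five folds, every case ends up in the test set exactly once.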

Models

This framework features all models explored in [3], implemented in 2D + 3D. The proposed Retina U-Net is a simple but effective architecture that fuses state-of-the-art semantic segmentation with object detection.

The framework also includes implementations of prevalent object detectors:

  • Mask R-CNN
  • Faster R-CNN+ (Faster R-CNN with RoIAlign)
  • Retina Net
  • U-Faster R-CNN+ (the two-stage counterpart of Retina U-Net: Faster R-CNN with auxiliary semantic segmentation)
  • DetU-Net (a U-Net-like segmentation architecture with heuristics for object detection)



Training annotations

This framework features training with pixel-wise and/or bounding box annotations. To avoid having to transform box coordinates during data augmentation, we feed the annotation masks through data augmentation (creating a pseudo mask if only bounding box annotations are provided) and draw the boxes afterwards.
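The idea of augmenting masks and drawing boxes afterwards can be illustrated with a dependency-free sketch. All function names are hypothetical, boxes are assumed to be (y1, x1, y2, x2) with exclusive end coordinates, and a 90-degree rotation stands in for the real augmentation pipeline.

```python
def box_to_mask(box, shape):
    """Create a binary pseudo mask (list of rows) from a box (y1, x1, y2, x2)."""
    y1, x1, y2, x2 = box
    height, width = shape
    return [[1 if y1 <= y < y2 and x1 <= x < x2 else 0 for x in range(width)]
            for y in range(height)]


def rotate90(mask):
    """Stand-in augmentation: rotate the mask by 90 degrees clockwise."""
    return [list(row) for row in zip(*mask[::-1])]


def mask_to_box(mask):
    """Draw the tight box (y1, x1, y2, x2) around all foreground pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not ys:
        return None  # object was augmented out of the image
    return (ys[0], xs[0], ys[-1] + 1, xs[-1] + 1)


# Only a box annotation is available: build a pseudo mask, augment it,
# and read the new box off the augmented mask.
pseudo_mask = box_to_mask((1, 2, 3, 5), shape=(4, 6))
augmented_box = mask_to_box(rotate90(pseudo_mask))
```

Because the box is derived from the augmented mask, no separate coordinate transform has to be written for each augmentation.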


The framework further handles two types of pixel-wise annotations:

  1. A label map with individual ROIs identified by increasing label values, accompanied by a vector containing in each position the class target for the lesion with the corresponding label (for this mode, set get_rois_from_seg_flag = False when calling ConvertSegToBoundingBoxCoordinates in your data loader).
  2. A binary label map. There is only one foreground class, and single lesions are not identified. All lesions have the same class target (foreground). In this case, the data loader runs a connected component labelling algorithm to create processable lesion / class target pairs on the fly (for this mode, set get_rois_from_seg_flag = True when calling ConvertSegToBoundingBoxCoordinates in your data loader).
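For the binary label map mode, the following sketch shows what connected component labelling does in principle. It is a plain 2D, 4-connectivity illustration with hypothetical function names, not the code behind ConvertSegToBoundingBoxCoordinates; the class target value 1 for foreground is likewise an assumption.

```python
from collections import deque


def label_components(binary):
    """Label 4-connected foreground regions of a 2D binary map with 1, 2, 3, ..."""
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    n_labels = 0
    for sy in range(height):
        for sx in range(width):
            if not binary[sy][sx] or labels[sy][sx]:
                continue
            n_labels += 1
            labels[sy][sx] = n_labels
            queue = deque([(sy, sx)])
            while queue:  # breadth-first flood fill of one lesion
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = n_labels
                        queue.append((ny, nx))
    return labels, n_labels


def rois_from_binary(binary):
    """Return one (box, class_target) pair per lesion; boxes are (y1, x1, y2, x2)."""
    labels, n_labels = label_components(binary)
    rois = []
    for lab in range(1, n_labels + 1):
        coords = [(y, x) for y, row in enumerate(labels)
                  for x, value in enumerate(row) if value == lab]
        ys = [c[0] for c in coords]
        xs = [c[1] for c in coords]
        rois.append(((min(ys), min(xs), max(ys) + 1, max(xs) + 1), 1))
    return rois
```

In the first annotation mode this labelling step is unnecessary, because the label map already separates the lesions and the class targets come from the accompanying vector.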

Prediction pipeline

This framework provides an inference module that automatically handles patching of inputs as well as tiling, ensembling, and weighted consolidation of output predictions.
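As a rough picture of the patching step, the sketch below computes overlapping patch coordinates that cover a 2D or 3D image. The function names and the strategy of shifting the last patch back to the image border are assumptions for illustration and may not match the toolkit's implementation.

```python
from itertools import product


def patch_starts(image_size, patch_size, overlap):
    """Start indices of patches along one axis such that the whole axis is covered."""
    if patch_size >= image_size:
        return [0]  # a single patch covers the axis (padding would be needed)
    stride = patch_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than patch_size")
    starts = list(range(0, image_size - patch_size, stride))
    starts.append(image_size - patch_size)  # last patch ends exactly at the border
    return starts


def patch_grid(image_shape, patch_shape, overlap):
    """All patch coordinates as tuples of (start, stop) per axis, for 2D or 3D."""
    per_axis = [patch_starts(s, p, overlap) for s, p in zip(image_shape, patch_shape)]
    return [
        tuple((start, start + min(p, s))
              for start, p, s in zip(corner, patch_shape, image_shape))
        for corner in product(*per_axis)
    ]
```

Predictions made on each patch are shifted back by the patch start coordinates before they are consolidated, so boxes from overlapping patches end up in a common image coordinate system.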




Consolidation of predictions (Weighted Box Clustering)

Multiple predictions of the same image (from test-time augmentations, tested epochs, and overlapping patches) result in a large number of boxes (or cubes), which need to be consolidated. In semantic segmentation, the final output would typically be obtained by averaging every pixel over all predictions. As described in [3], weighted box clustering (WBC) does the equivalent for box predictions.
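The following is a simplified 2D sketch of the clustering idea, not the toolkit's implementation. It assumes boxes of the form (y1, x1, y2, x2), weights each box by its area times its overlap with the highest-scoring box of the cluster, and omits the patch-center weighting; n_expected is a hypothetical parameter for the number of predictions that could have seen an object. See [3] for the exact formulation.

```python
def area(box):
    """Area of a box given as (y1, x1, y2, x2)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two boxes."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def weighted_box_clustering(boxes, scores, iou_thresh=0.5, n_expected=None):
    """Merge overlapping boxes into a list of (box, score) pairs."""
    if any(area(b) <= 0 for b in boxes):
        raise ValueError("all boxes must have positive area")
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    merged = []
    while remaining:
        top = remaining[0]
        overlaps = {i: iou(boxes[top], boxes[i]) for i in remaining}
        # Cluster: the highest-scoring box plus every box overlapping it enough.
        cluster = [top] + [i for i in remaining[1:] if overlaps[i] >= iou_thresh]
        remaining = [i for i in remaining[1:] if overlaps[i] < iou_thresh]

        weights = [area(boxes[i]) * overlaps[i] for i in cluster]
        scored = [scores[i] * w for i, w in zip(cluster, weights)]
        if sum(scored) > 0:
            # Coordinates: average weighted by score * weight.
            box = tuple(sum(sw * boxes[i][k] for sw, i in zip(scored, cluster))
                        / sum(scored) for k in range(4))
        else:
            box = tuple(float(v) for v in boxes[top])
        # Score: weighted mean, down-weighted when expected predictions are missing.
        n_missing = max(0, (n_expected or len(cluster)) - len(cluster))
        mean_weight = sum(weights) / len(weights)
        score = sum(scored) / (sum(weights) + n_missing * mean_weight)
        merged.append((box, score))
    return merged
```

In this sketch, an object found by only one of two expected predictions has its confidence halved, while an object found by both keeps the weighted mean of the individual scores.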





Visualization / Monitoring

By default, loss functions and performance metrics are monitored:




Histograms of matched output predictions for training/validation/testing are plotted per foreground class:



Input images + ground truth annotations + output predictions of a sampled validation batch are plotted after each epoch (here a sampled 2D slice with ±3 neighbouring context slices in channels):



Zoomed into the last two lines of the plot:


License

This framework is published under the Apache License Version 2.0.
