The open-source and free to use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure

Overview

miseval: a metric library for Medical Image Segmentation EVALuation


The open-source and free-to-use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure. We hope that this package will help improve evaluation quality, reproducibility, and comparability in future studies in the field of medical image segmentation.

Guideline on Evaluation Metrics for Medical Image Segmentation

  1. Use DSC as main metric for validation and performance interpretation.
  2. Use AHD for interpretation on point position sensitivity (contour) if needed.
  3. Avoid any interpretations based on high pixel accuracy scores.
  4. In addition to DSC, also provide IoU, Sensitivity, and Specificity for method comparability.
  5. Provide sample visualizations, comparing the annotated and predicted segmentation, for visual evaluation as well as to avoid statistical bias.
  6. Avoid cherry-picking high-scoring samples.
  7. Provide histograms or box plots showing the scoring distribution across the dataset.
  8. For multi-class problems, provide metric computations for each class individually.
  9. Avoid confirmation bias from macro-averaging over classes, which inflates scores by including the background class.
  10. Provide access to evaluation scripts and results with journal data services or third-party services like GitHub and Zenodo for easier reproducibility.
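
Guidelines 1, 3, and 4 can be illustrated with a small NumPy sketch. The helper `confusion_scores` is a hypothetical name and not part of miseval; it derives DSC, IoU, Sensitivity, Specificity, and pixel accuracy from the confusion counts of a binary mask. Returning 0.0 for an undefined score is an assumption made for this example. The toy case uses a small region of interest and a prediction that misses it entirely.

```python
import numpy as np

def confusion_scores(truth, pred):
    """Hypothetical helper (not part of miseval): scores for a binary mask."""
    truth = np.asarray(truth).astype(bool)
    pred = np.asarray(pred).astype(bool)
    tp = int(np.sum(truth & pred))
    fp = int(np.sum(~truth & pred))
    fn = int(np.sum(truth & ~pred))
    tn = int(np.sum(~truth & ~pred))

    def ratio(num, den):
        # Assumption: return 0.0 when a score is undefined (zero denominator).
        return num / den if den > 0 else 0.0

    return {
        "DSC": ratio(2 * tp, 2 * tp + fp + fn),
        "IoU": ratio(tp, tp + fp + fn),
        "Sensitivity": ratio(tp, tp + fn),
        "Specificity": ratio(tn, tn + fp),
        "Accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }

# Toy case: a 4x4 lesion inside a 64x64 image, and an all-background prediction.
truth = np.zeros((64, 64), dtype=int)
truth[10:14, 10:14] = 1
pred = np.zeros((64, 64), dtype=int)

scores = confusion_scores(truth, pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In this toy case the pixel accuracy is above 0.99 although the DSC is 0, because the background dominates the pixel count.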

Implemented Metrics

Metric | Index in miseval | Function in miseval
Dice Similarity Index | "DSC", "Dice", "DiceSimilarityCoefficient" | miseval.calc_DSC()
Intersection-Over-Union | "IoU", "Jaccard", "IntersectionOverUnion" | miseval.calc_IoU()
Sensitivity | "SENS", "Sensitivity", "Recall", "TPR", "TruePositiveRate" | miseval.calc_Sensitivity()
Specificity | "SPEC", "Specificity", "TNR", "TrueNegativeRate" | miseval.calc_Specificity()
Precision | "PREC", "Precision" | miseval.calc_Precision()
Accuracy | "ACC", "Accuracy", "RI", "RandIndex" | miseval.calc_Accuracy()
Balanced Accuracy | "BACC", "BalancedAccuracy" | miseval.calc_BalancedAccuracy()
Adjusted Rand Index | "ARI", "AdjustedRandIndex" | miseval.calc_AdjustedRandIndex()
AUC | "AUC", "AUC_trapezoid" | miseval.calc_AUC()
Cohen's Kappa | "KAP", "Kappa", "CohensKappa" | miseval.calc_Kappa()
Hausdorff Distance | "HD", "HausdorffDistance" | miseval.calc_SimpleHausdorffDistance()
Average Hausdorff Distance | "AHD", "AverageHausdorffDistance" | miseval.calc_AverageHausdorffDistance()
Volumetric Similarity | "VS", "VolumetricSimilarity" | miseval.calc_VolumetricSimilarity()
True Positive | "TP", "TruePositive" | miseval.calc_TruePositive()
False Positive | "FP", "FalsePositive" | miseval.calc_FalsePositive()
True Negative | "TN", "TrueNegative" | miseval.calc_TrueNegative()
False Negative | "FN", "FalseNegative" | miseval.calc_FalseNegative()
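
For reference, the overlap-based metrics in this list are commonly defined from the confusion counts TP, FP, TN, and FN of a class as shown below. These are the standard definitions; how miseval handles edge cases such as empty masks is not covered here.

```latex
\begin{align}
\mathrm{DSC} &= \frac{2\,TP}{2\,TP + FP + FN} \\
\mathrm{IoU} &= \frac{TP}{TP + FP + FN} \\
\mathrm{Sensitivity} &= \frac{TP}{TP + FN} \\
\mathrm{Specificity} &= \frac{TN}{TN + FP} \\
\mathrm{Precision} &= \frac{TP}{TP + FP} \\
\mathrm{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}
\end{align}
```

DSC and IoU are monotonically related through DSC = 2 IoU / (1 + IoU), so they rank any two segmentations of the same sample identically.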

How to Use

Example

# load libraries
import numpy as np
from miseval import evaluate

# Get some ground truth / annotated segmentations
np.random.seed(1)
real_bi = np.random.randint(2, size=(64,64))  # binary (2 classes)
real_mc = np.random.randint(5, size=(64,64))  # multi-class (5 classes)
# Get some predicted segmentations
np.random.seed(2)
pred_bi = np.random.randint(2, size=(64,64))  # binary (2 classes)
pred_mc = np.random.randint(5, size=(64,64))  # multi-class (5 classes)

# Run binary evaluation
dice = evaluate(real_bi, pred_bi, metric="DSC")    
  # returns single np.float64 e.g. 0.75

# Run multi-class evaluation
dice_list = evaluate(real_mc, pred_mc, metric="DSC", multi_class=True,
                     n_classes=5)   
  # returns array of np.float64 e.g. [0.9, 0.2, 0.6, 0.0, 0.4]
  # for each class, one score

Core function: evaluate()

Every metric in miseval can be called via our core function evaluate().

The miseval evaluate function can be run with different metrics as its backbone.
You can pass the following options to the metric parameter:

  • String naming one of the metric labels, for example "DSC"
  • Directly passing a metric function, for example calc_DSC_Sets (from dice.py)
  • Passing a custom metric function

List of metrics: see miseval/__init__.py under the section "Access Functions to Metric Functions".
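
As a sketch of the third option, a custom metric can be written as a plain function on two NumPy masks. The function name `calc_precision_custom` is hypothetical, and the `c` keyword (the class index being scored) is an assumption about how evaluate() calls a metric function; check the miseval source before relying on it.

```python
import numpy as np

def calc_precision_custom(truth, pred, c=1, **kwargs):
    """Hypothetical custom metric: precision for class c.

    Assumption: evaluate() passes the class index as keyword `c`.
    """
    gt = np.equal(truth, c)
    pr = np.equal(pred, c)
    tp = np.logical_and(gt, pr).sum()
    fp = np.logical_and(~gt, pr).sum()
    # Assumption: report 0.0 when class c is never predicted.
    return float(tp / (tp + fp)) if (tp + fp) > 0 else 0.0

truth = np.array([[0, 1, 1],
                  [0, 1, 0]])
pred = np.array([[0, 1, 0],
                 [1, 1, 0]])

# Direct call on the toy masks: 2 true positives, 1 false positive.
score = calc_precision_custom(truth, pred)
print(score)

# With miseval installed, the same function could be handed to evaluate():
#   from miseval import evaluate
#   evaluate(truth, pred, metric=calc_precision_custom)
```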

The class labels in a segmentation mask must be consecutive integers starting from 0 (that is, integers from 0 to n_classes-1).

A segmentation mask is allowed to have either no channel axis or just 1 (e.g. 512x512x1), which contains the annotation.
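
A minimal sketch of preparing a mask to meet both requirements, using only NumPy. The helper `prepare_mask` and its `label_values` parameter are hypothetical and not part of miseval: the helper drops a trailing single-channel axis and maps a known set of raw label values (for example 0, 128, 255) to consecutive integers starting at 0.

```python
import numpy as np

def prepare_mask(mask, label_values):
    """Hypothetical helper (not part of miseval)."""
    mask = np.asarray(mask)
    # Drop a trailing channel axis of size 1, e.g. (512, 512, 1) -> (512, 512).
    if mask.ndim > 2 and mask.shape[-1] == 1:
        mask = mask[..., 0]
    label_values = np.sort(np.asarray(label_values))
    if not np.isin(mask, label_values).all():
        raise ValueError("mask contains a value missing from label_values")
    # The position of each raw value in the sorted label list is its class index.
    return np.searchsorted(label_values, mask)

raw = np.array([[0, 128], [255, 128]]).reshape(2, 2, 1)
clean = prepare_mask(raw, label_values=[0, 128, 255])
print(clean.shape)
print(clean)
```

Passing the same `label_values` for the ground truth and the prediction keeps class indices aligned even when a class is missing from one of the two masks.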

def evaluate(truth, pred, metric, multi_class=False, n_classes=2):
    """
    Arguments:
        truth (NumPy Matrix):            Ground Truth segmentation mask.
        pred (NumPy Matrix):             Prediction segmentation mask.
        metric (String or Function):     Metric function. Either a function directly or encoded as String from miseval or a custom function.
        multi_class (Boolean):           Boolean parameter, if segmentation is a binary or multi-class problem. By default False -> Binary mode.
        n_classes (Integer):             Number of classes. By default 2 -> Binary

    Output:
        score (Float) or scores (List of Float)

        The multi_class parameter defines the output of this function.
        If n_classes > 2, multi_class is automatically True.
        If multi_class == False & n_classes == 2, only a single score (float) is returned.
        If multi_class == True, multiple scores as a list are returned (for each class one score).
    """
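
The output rules above can be sketched with a small stand-in. The function `evaluate_sketch` is hypothetical and only mimics the documented behavior (a single score in binary mode, one score per class otherwise) with a hand-written Dice function; it is not the miseval implementation. Scoring class 1 in binary mode and returning 0.0 for an absent class are assumptions.

```python
import numpy as np

def dice_for_class(truth, pred, c):
    """Dice score of class c, treating all other classes as background."""
    gt = np.equal(truth, c)
    pr = np.equal(pred, c)
    denom = gt.sum() + pr.sum()
    # Assumption: score 0.0 when class c is absent from both masks.
    return float(2 * np.logical_and(gt, pr).sum() / denom) if denom > 0 else 0.0

def evaluate_sketch(truth, pred, multi_class=False, n_classes=2):
    """Hypothetical stand-in that follows the documented output rules."""
    if n_classes > 2:
        multi_class = True
    if not multi_class:
        # Binary mode: a single score (assumed here to be for class 1).
        return dice_for_class(truth, pred, c=1)
    # Multi-class mode: one score per class.
    return [dice_for_class(truth, pred, c) for c in range(n_classes)]

truth = np.array([[0, 1, 2],
                  [0, 1, 2]])
pred = np.array([[0, 1, 1],
                 [0, 2, 2]])

single = evaluate_sketch(truth > 0, pred > 0)           # binary: foreground vs. background
per_class = evaluate_sketch(truth, pred, n_classes=3)   # one score per class
print(single)
print(per_class)
```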

Installation

  • Install miseval from PyPI (recommended):
pip install miseval
  • Alternatively: install miseval from the GitHub source:

First, clone miseval using git:

git clone https://github.com/frankkramer-lab/miseval

Then, go into the miseval folder and run the install command:

cd miseval
python setup.py install

Author

Dominik Müller
IT-Infrastructure for Translational Medical Research
University of Augsburg
Bavaria, Germany

How to cite / More information

Dominik Müller, Dennis Hartmann, Philip Meyer, Florian Auer, Iñaki Soto-Rey, Frank Kramer. (2022)
MISeval: a Metric Library for Medical Image Segmentation Evaluation.
arXiv e-print: https://arxiv.org/abs/2201.09395

@inproceedings{misevalMUELLER2022,
  title={MISeval: a Metric Library for Medical Image Segmentation Evaluation},
  author={Dominik Müller and Dennis Hartmann and Philip Meyer and Florian Auer and Iñaki Soto-Rey and Frank Kramer},
  year={2022},
  eprint={2201.09395},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Thank you for citing our work.

License

This project is licensed under the GNU GENERAL PUBLIC LICENSE Version 3.
See the LICENSE.md file for license rights and limitations.
