This code is for eCaReNet: explainable Cancer Relapse Prediction Network.

Related tags

Deep Learningecarenet
Overview

eCaReNet

This code is for eCaReNet: explainable Cancer Relapse Prediction Network. (Towards Explainable End-to-End Prostate Cancer Relapse Prediction from H&E Images Combining Self-Attention MultipleInstance Learning with a Recurrent Neural Network, Dietrich, E., Fuhlert, P., Ernst, A., Sauter, G., Lennartz, M., Stiehl, H. S., Zimmermann, M., Bonn, S. - ML4H 2021)

eCaReNet takes histopathology images (TMA spots) as input and predicts a survival curve and a risk score for individual patients. The network consists of an optional self-attention layer, an RNN and an attention based Mulitple Instance Learning module for explainability. To increase model performance, we suggest to include a binary prediction of a relapse as input to the model. alt text

TL;DR

  • store your dataset information in a .csv file

  • make your own my_config.yaml, following the example in config.yaml

  • run $ python train_model.py with my_config.yaml

Requirements and Installation

  • Python and Tensorflow
  • npm install -g omniboard to view results in browser and inspect experiments

Data preprocessing

All annotations of your images need to be stored in a .csv file with the image path and annotations as columns. You need separate csv files for your training, validation and test sets. Here is an example:

img_path censored relapse_time survived_2years ISUP_score
img1.png 0 80.3 1 3

The columns can be named as you wish, you need to tell python which columns to use in the config file ↓

config file

The config file (config.yaml) is needed to define the directories where the images and training, validation and test .csv files are stored. Further, you can choose whether to train a classification (for M_ISUP or M_Bin) or the survival model eCaReNet, which loss function and optimizer to use. Also the preprocessing is defined here (patching, resizing, ...) Details are found in config.yaml. It is best to create a custom my_config.yaml file and run the code as

$ python train_model.py with my_config.yaml 

You can also change single parameters in the command line like

$ python train_model.py with my_config.yaml general.seed="13" 

Training procedure

As recommended in our paper, we suggest to first train M_ISUP with config_isup.yaml. So train your base network (Inception or other) on a classification task as a transfer learning task. Second, train a binary classifier with config_bin.yaml, choose an appropriate time point to base the decision on. Here, you need to load the pretrained model from step one, do not load Inception or other keras models. For the third step the prediction from model two M_Bin are needed, so please store the information in the .csv file. Then again, load model from step one, and this time include the predictions as additional input and train.

Unittests

For most functions, a unittest is given in the test folder. This can be used to test if the function works correctly after adapting it (e.g. add more functionality or speed up). Further, it can be used for debugging to find errors or to find out what the function is actually doing. This is faster than running the whole code.

Docker

In the docker_context folder, the Dockerfile and requirements are stored. To build a docker image run

$ "docker build -t IMAGE_NAME:DATE docker_context"
$ docker build  -t ecarenet_docker:2021_10 docker_context

To use the image, run something like the following. Needs to be adapted to your own paths and resources

$ docker run --gpus=all --cpuset-cpus=5,6 --user `id --user`:`id --group` -it --rm -v /home/UNAME/PycharmProjects/ecarenet:/opt/project -v /PATH/TO/DATA:/data --network NETWORK_NAME--name MY_DOCKER_CONTAINER ecarenet_docker:2021_10 bash

More information on docker can be found here: https://docs.docker.com/get-started/

sacred

We use sacred (https://sacred.readthedocs.io/en/stable/) and a MongoDB (https://sacred.readthedocs.io/en/stable/observers.html https://www.mongodb.com/) to store all experiments. For each training run, a folder with an increasing id will be created automatically and all information about the run will be stored in that folder, like resulting weights, plots, metrics and losses. Which folder this is, is written in settings/default_settings and in the config in training.model_save_path.

The code works without mongodb, if you want. All results will be stored. If you do want to use the mongodb, you need to run a docker container with a mongoDB:

$ docker run -d -p PORTNUMBER:27017 -v ./my_data_folder:/data/db --name name_of_my_mongodb mongo

Create a network:

$ docker network create NETWORK_NAME

Attach container to network

$ docker network connect NETWORK_NAME name_of_my_mongodb

Then during training, the --network NETWORK_NAME property needs to be set. Use omniboard to inspect results:

$ omniboard -m localhost:PORTNUMBER:sacred

Tensorflow

Using tensorflow tf.data speeds up the data generation and preprocessing steps. If the dataset is very large, it can be cached with tf.cache() either in memory or to a file with tf.cache('/path/to/folder/plus/filename') [in dataset_creation/dataset_main.py]. Using tensorflow, it is also best to not overly use numpy functions or to decorate them with tf.function. Functions decorated with @tf.function will be included in the tensorflow graph in the first step and not be created again and again. For debugging, you need to remove the tf.function decorator, because otherwise the function (and breakpoints inside) will be skipped.

Owner
Institute of Medical Systems Biology
Institute of Medical Systems Biology
How to train a CNN to 99% accuracy on MNIST in less than a second on a laptop

Training a NN to 99% accuracy on MNIST in 0.76 seconds A quick study on how fast you can reach 99% accuracy on MNIST with a single laptop. Our answer

Tuomas Oikarinen 42 Dec 10, 2022
Google Landmark Recogntion and Retrieval 2021 Solutions

Google Landmark Recogntion and Retrieval 2021 Solutions In this repository you can find solution and code for Google Landmark Recognition 2021 and Goo

Vadim Timakin 5 Nov 25, 2022
FasterAI: A library to make smaller and faster models with FastAI.

Fasterai fasterai is a library created to make neural network smaller and faster. It essentially relies on common compression techniques for networks

Nathan Hubens 193 Jan 01, 2023
Pytorch implementation of MaskGIT: Masked Generative Image Transformer

Pytorch implementation of MaskGIT: Masked Generative Image Transformer

Dominic Rampas 247 Dec 16, 2022
Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

WangRui 14 Sep 14, 2022
Multi-Task Deep Neural Networks for Natural Language Understanding

New Release We released Adversarial training for both LM pre-training/finetuning and f-divergence. Large-scale Adversarial training for LMs: ALUM code

Xiaodong 2.1k Dec 30, 2022
Semantic similarity computation with different state-of-the-art metrics

Semantic similarity computation with different state-of-the-art metrics Description • Installation • Usage • License Description TaxoSS is a semantic

6 Jun 22, 2022
Spatial color quantization in Rust

rscolorq Rust port of Derrick Coetzee's scolorq, based on the 1998 paper "On spatial quantization of color images" by Jan Puzicha, Markus Held, Jens K

Collyn O'Kane 37 Dec 22, 2022
Scalable machine learning based time series forecasting

mlforecast Scalable machine learning based time series forecasting. Install PyPI pip install mlforecast Optional dependencies If you want more functio

Nixtla 145 Dec 24, 2022
lightweight python wrapper for vowpal wabbit

vowpal_porpoise Lightweight python wrapper for vowpal_wabbit. Why: Scalable, blazingly fast machine learning. Install Install vowpal_wabbit. Clone and

Joseph Reisinger 163 Nov 24, 2022
Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning"

Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning" This is the code for the paper Solving Graph-based Public Goo

Victor-Alexandru Darvariu 3 Dec 05, 2022
A Nim frontend for pytorch, aiming to be mostly auto-generated and internally using ATen.

Master Release Pytorch - Py + Nim A Nim frontend for pytorch, aiming to be mostly auto-generated and internally using ATen. Because Nim compiles to C+

Giovanni Petrantoni 425 Dec 22, 2022
DilatedNet in Keras for image segmentation

Keras implementation of DilatedNet for semantic segmentation A native Keras implementation of semantic segmentation according to Multi-Scale Context A

303 Mar 15, 2022
YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset

YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset, and represents Ultralytics open-source research int

阿才 73 Dec 16, 2022
Bling's Object detection tool

BriVL for Building Applications This repo is used for illustrating how to build applications by using BriVL model. This repo is re-implemented from fo

chuhaojin 47 Nov 01, 2022
Codebase for "ProtoAttend: Attention-Based Prototypical Learning."

Codebase for "ProtoAttend: Attention-Based Prototypical Learning." Authors: Sercan O. Arik and Tomas Pfister Paper: Sercan O. Arik and Tomas Pfister,

47 2 May 17, 2022
source code the paper Fast and Robust Iterative Closet Point.

Fast-Robust-ICP This repository includes the source code the paper Fast and Robust Iterative Closet Point. Authors: Juyong Zhang, Yuxin Yao, Bailin De

yaoyuxin 320 Dec 28, 2022
Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu,

GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan 70 Dec 18, 2022
PED: DETR for Crowd Pedestrian Detection

PED: DETR for Crowd Pedestrian Detection Code for PED: DETR For (Crowd) Pedestrian Detection Paper PED: DETR for Crowd Pedestrian Detection Installati

36 Sep 13, 2022
Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

human-pose-estimation-3d-python-cpp RealSenseD435 (RGB) 480x640 + CPU Corei9 45 FPS (Depth is not used) 1. Run 1-1. RealSenseD435 (RGB) 480x640 + CPU

Katsuya Hyodo 8 Oct 03, 2022