Behavioral Testing of Clinical NLP Models

This repository contains code for testing the behavior of clinical prediction models based on patient letters. For a detailed description of the testing framework, see our paper "What Do You See in this Patient? Behavioral Testing of Clinical NLP Models".

Usage

Install requirements: pip install -r requirements.txt

Run main.py, e.g. to test diagnosis prediction on gender, age and ethnicity:

python main.py \
    --test_set_path ./path_to_test_set \
    --model_path bvanaken/CORe-clinical-diagnosis-prediction \
    --task diagnosis \
    --shift_keys gender,age,ethnicity \
    --save_dir ./results \
    --gpu False

Parameter	Description
test_set_path	Path to original test set file
model_path	Path to a local model or a Hugging Face model hub checkpoint
task	Current options: diagnosis, mortality
shift_keys	Which patient characteristics to test. Current options: age, gender, ethnicity, weight, intersectional (gender + ethnicity)
save_dir	Directory to save results, default: "./results"
gpu	Whether to use a GPU during inference, default: False
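
As a rough illustration of how the comma-separated shift_keys value relates to the options listed above, the following sketch splits and validates such a string. The helper parse_shift_keys and the constant VALID_SHIFT_KEYS are hypothetical names chosen for this example; they are not taken from main.py, which may handle the argument differently.

```python
from typing import List

# Option names copied from the parameter table; the helper itself is hypothetical.
VALID_SHIFT_KEYS = {"age", "gender", "ethnicity", "weight", "intersectional"}


def parse_shift_keys(raw: str) -> List[str]:
    """Split a comma-separated --shift_keys value and reject unknown keys."""
    keys = [key.strip() for key in raw.split(",") if key.strip()]
    unknown = [key for key in keys if key not in VALID_SHIFT_KEYS]
    if unknown:
        raise ValueError(f"Unknown shift keys: {unknown}")
    return keys


if __name__ == "__main__":
    print(parse_shift_keys("gender,age,ethnicity"))
```

Checking the keys before inference starts would surface a typo such as "etnicity" immediately rather than after the model has been loaded.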

Using Non-Transformer models

The framework currently focuses on testing Transformer-based models, but it can be extended to any other prediction model. To do so, create a new class that implements the Predictor interface and add it to TASK_MAP in main.py.
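
The extension pattern described above might look roughly like the sketch below. The actual Predictor interface and the structure of TASK_MAP are defined in this repository and are not reproduced here; the abstract class, the predict_batch method name, the KeywordPredictor toy model, and the dictionary layout are all assumptions made for illustration only. Check the real interface in the code before adapting this.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Predictor(ABC):
    """Hypothetical stand-in for the repository's Predictor interface."""

    @abstractmethod
    def predict_batch(self, letters: List[str]) -> List[List[float]]:
        """Return one list of label scores per patient letter."""


class KeywordPredictor(Predictor):
    """Toy non-Transformer model: a label scores 1.0 if any of its keywords occurs."""

    def __init__(self, label_keywords: Dict[str, List[str]]):
        self.labels = list(label_keywords)
        self.label_keywords = label_keywords

    def predict_batch(self, letters: List[str]) -> List[List[float]]:
        results = []
        for letter in letters:
            text = letter.lower()
            results.append([
                1.0 if any(kw in text for kw in self.label_keywords[label]) else 0.0
                for label in self.labels
            ])
        return results


# Assumed registry shape: task name -> predictor class (the real TASK_MAP may differ).
TASK_MAP = {"diagnosis": KeywordPredictor}

if __name__ == "__main__":
    predictor = TASK_MAP["diagnosis"]({"diabetes": ["insulin"], "pneumonia": ["cough"]})
    print(predictor.predict_batch(["Patient on insulin therapy."]))
```

The point of the sketch is only that the testing code depends on a single prediction method, so any model that can map letters to label scores can be plugged in.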

Cite

@inproceedings{vanAken2021,
  author    = {Betty van Aken and
               Sebastian Herrmann and
               Alexander Löser},
  title     = {What Do You See in this Patient? Behavioral Testing of Clinical NLP Models},
  booktitle = {Bridging the Gap: From Machine Learning Research to Clinical Practice, 
               Research2Clinics Workshop @ NeurIPS 2021},
  year      = {2021}
}
