PRAnCER is a web platform that enables the rapid annotation of medical terms within clinical notes.

Overview

PRAnCER

PRAnCER (Platform enabling Rapid Annotation for Clinical Entity Recognition) is a web platform that enables the rapid annotation of medical terms within clinical notes. A user can highlight spans of text and quickly map them to concepts in large vocabularies within a single, intuitive platform. Users can use the search and recommendation features to find labels without ever needing to leave the interface. Further, the platform can take in output from existing clinical concept extraction systems as pre-annotations, which users can accept or modify in a single click. These features allow users to focus their time and energy on harder examples instead.

Usage

Installation Instructions

Detailed installation instructions are provided below; PRAnCER can operate on Mac, Windows, and Linux machines.

Linking to UMLS Vocabulary

Use of the platform requires a UMLS license, as it requires several UMLS-derived files to surface recommendations. Please email magrawal (at) mit (dot) edu to request these files, along with your API key so we may confirm. You can sign up here. Surfacing additional information in the UI also requires you enter your UMLS API key in application/utils/constants.py.

Loading in and Exporting Data

To load in data, users directly place any clinical notes as .txt files in the /data folder; an example file is provided. The output of annotation is .json file in the /data folder with the same file prefix as the .txt. To start annotating a note from scratch, a user can just delete the corresponding .json file.

Pre-filled Suggestions

Two options exist for pre-filled suggestions; users specify which they want to use in application/utils/constants.py. The default is "MAP". Option 1 for pre-filled suggestions is "MAP", if users want to preload annotations based on a dictionary of high-precision text to CUI for their domain, e.g. {hypertension: "C0020538"}. A pre-created dictionary will be provided alongside the UMLS files described above. Option 2 for pre-filled suggestions is "CSV", if users want to load in pre-computed pre-annotations (e.g. from their own algorithm, scispacy, cTAKES, MetaMap). Users simply place a CSV of spans and CUIs, with the same prefix as the data .txt file, and our scripts will automatically incorporate those annotations. example.csv in the /data file provides an example.

Installation

The platform requires python3.7, node.js, and several other python and javascript packages. Specific installation instructions for each follow!

Backend requirements

1) First check if python3 is installed.

You can check to see if it is installed:

$ python3 --version

If it is installed, you should see Python 3.7.x

If you need to install it, you can easily do that with a package manager like Homebrew:

$ brew install python3

2) With python3 installed, install necessary python packages.

You can install packages with the python package installer pip:

$ pip3 install flask flask_script flask_migrate flask_bcrypt nltk editdistance requests lxml

Frontend requirements

3) Check to see if npm and node.js are installed:

$ npm -v
$ node -v

If they are, you can skip to Step 4. If not, to install node, first install nvm:

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.1/install.sh | bash

Source: https://github.com/nvm-sh/nvm

Re-start your terminal and confirm nvm installation with:

command -v nvm

Which will return nvm if successful

Then install node version 10.15.1:

$ nvm install 10.15.1

4) Install the node dependencies:

$ cd static
$ npm install --save

For remote server applications, permissions errors may be triggered.
If so, try adding --user to install commands.

Run program

Run the backend

Open one terminal tab to run the backend server:

$ python3 manage.py runserver

If all goes well, you should see * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit) followed by a few more lines in the terminal.

Run the frontend

Open a second terminal tab to run the frontend:

$ cd static
$ npm start

After this, open your browser to http://localhost:3000 and you should see the homepage!

Contact

If you have any questions, please email Monica Agrawal [[email protected]]. Credit belongs to Ariel Levy for the development of this platform.

Based on React-Redux-Flask boilerplate.

Owner
Sontag Lab
Machine learning algorithms and applications to health care.
Sontag Lab
Text Classification Using LSTM

Text classification is the task of assigning a set of predefined categories to free text. Text classifiers can be used to organize, structure, and categorize pretty much anything. For example, new ar

KrishArul26 3 Jan 03, 2023
Modeling cumulative cases of Covid-19 in the US during the Covid 19 Delta wave using Bayesian methods.

Introduction The goal of this analysis is to find a model that fits the observed cumulative cases of COVID-19 in the US, starting in Mid-July 2021 and

Alexander Keeney 1 Jan 05, 2022
Binary LSTM model for text classification

Text Classification The purpose of this repository is to create a neural network model of NLP with deep learning for binary classification of texts re

Nikita Elenberger 1 Mar 11, 2022
This repo stores the codes for topic modeling on palliative care journals.

This repo stores the codes for topic modeling on palliative care journals. Data Preparation You first need to download the journal papers. bash 1_down

3 Dec 20, 2022
CoSENT、STS、SentenceBERT

CoSENT_Pytorch 比Sentence-BERT更有效的句向量方案

102 Dec 07, 2022
XLNet: Generalized Autoregressive Pretraining for Language Understanding

Introduction XLNet is a new unsupervised language representation learning method based on a novel generalized permutation language modeling objective.

Zihang Dai 6k Jan 07, 2023
Exploring dimension-reduced embeddings

sleepwalk Exploring dimension-reduced embeddings This is the code repository. See here for the Sleepwalk web page. License and disclaimer This program

S. Anders's research group at ZMBH 91 Nov 29, 2022
SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering.

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering. Contents Inst

0 Oct 21, 2021
A python package for deep multilingual punctuation prediction.

This python library predicts the punctuation of English, Italian, French and German texts. We developed it to restore the punctuation of transcribed spoken language.

Oliver Guhr 27 Dec 22, 2022
A CSRankings-like index for speech researchers

Speech Rankings This project mimics CSRankings to generate an ordered list of researchers in speech/spoken language processing along with their possib

Mutian He 19 Nov 26, 2022
Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)

Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)

2 Dec 29, 2022
Neural text generators like the GPT models promise a general-purpose means of manipulating texts.

Boolean Prompting for Neural Text Generators Neural text generators like the GPT models promise a general-purpose means of manipulating texts. These m

Jeffrey M. Binder 20 Jan 09, 2023
Repository to hold code for the cap-bot varient that is being presented at the SIIC Defence Hackathon 2021.

capbot-siic Repository to hold code for the cap-bot varient that is being presented at the SIIC Defence Hackathon 2021. Problem Inspiration A plethora

Aryan Kargwal 19 Feb 17, 2022
SurvTRACE: Transformers for Survival Analysis with Competing Events

⭐ SurvTRACE: Transformers for Survival Analysis with Competing Events This repo provides the implementation of SurvTRACE for survival analysis. It is

Zifeng 13 Oct 06, 2022
Full Spectrum Bioinformatics - a free online text designed to introduce key topics in Bioinformatics using the Python

Full Spectrum Bioinformatics is a free online text designed to introduce key topics in Bioinformatics using the Python programming language. The text is written in interactive Jupyter Notebooks, whic

Jesse Zaneveld 33 Dec 28, 2022
A Lightweight NLP Data Loader for All Deep Learning Frameworks in Python

LineFlow: Framework-Agnostic NLP Data Loader in Python LineFlow is a simple text dataset loader for NLP deep learning tasks. LineFlow was designed to

TofuNLP 177 Jan 04, 2023
LeBenchmark: a reproducible framework for assessing SSL from speech

LeBenchmark: a reproducible framework for assessing SSL from speech

11 Nov 30, 2022
(ACL 2022) The source code for the paper "Towards Abstractive Grounded Summarization of Podcast Transcripts"

Towards Abstractive Grounded Summarization of Podcast Transcripts We provide the source code for the paper "Towards Abstractive Grounded Summarization

10 Jul 01, 2022
a test times augmentation toolkit based on paddle2.0.

Patta Image Test Time Augmentation with Paddle2.0! Input | # input batch of images / / /|\ \ \ # apply

AgentMaker 110 Dec 03, 2022
Just a Basic like Language for Zeno INC

zeno-basic-language Just a Basic like Language for Zeno INC This is written in 100% python. this is basic language like language. so its not for big p

Voidy Devleoper 1 Dec 18, 2021