Code for the modular summarization work published at ACL 2021 by Krishna et al.

Last update: Nov 24, 2022

Overview

This repository contains the code for running the modular summarization pipelines described in the publication:
Krishna K, Khosla K, Bigham J, Lipton ZC. "Generating SOAP Notes from Doctor-Patient Conversations." ACL 2021.

Instructions

Although we cannot release the models trained on confidential medical data, we have released models trained on the publicly available AMI dataset.
To reproduce the results on the AMI dataset, follow the steps listed below. For convenience, we have also created a Google Colab notebook that runs these steps on Google's servers (free of charge as of June 2021) and produces the summaries and their ROUGE scores.

Step 1: Set up the environment by installing the packages listed in requirements.txt using pip.
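A minimal sketch of this step driven from Python, for setups where the install is scripted. The helper names `pip_install_command` and `install_requirements` are hypothetical and not part of the repository. The sketch assumes requirements.txt sits in the current working directory and invokes pip through the running interpreter, so the packages land in whichever environment is currently active.

```python
import subprocess
import sys
from pathlib import Path


def pip_install_command(requirements="requirements.txt"):
    """Build the pip command for the interpreter that is currently running."""
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


def install_requirements(requirements="requirements.txt"):
    """Install the listed packages; fail early if the file is missing."""
    path = Path(requirements)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; run this from the repository root")
    # check=True raises CalledProcessError if pip exits with a non-zero status.
    subprocess.run(pip_install_command(path), check=True)
```

Calling `install_requirements()` from the repository root is equivalent to running pip on requirements.txt by hand. Creating a virtual environment first is optional and left out of the sketch.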

Step 2: Download the ami_models folder from this link and put it at the root of the repository.
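Because the prediction script needs the downloaded models, it can help to confirm the folder is in place before continuing. The check below is a small sketch, not part of the repository: the function name `find_missing_prerequisites` is hypothetical, and the only thing it assumes is that the folder is named ami_models and sits directly under the repository root, as this step describes.

```python
from pathlib import Path


def find_missing_prerequisites(repo_root=".", required=("ami_models",)):
    """Return the names of required folders that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in required if not (root / name).is_dir()]


# Example use from the repository root:
#   missing = find_missing_prerequisites()
#   if missing:
#       raise SystemExit("Missing at repository root: " + ", ".join(missing))
```

An empty list means the folder was found. The check does not inspect the folder's contents, so an incomplete download would still pass.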

Step 3: Run the following three commands to prepare the data, run the summary generation pipelines, and show the resulting ROUGE scores.

# command 1: downloads and preprocesses the AMI dataset
./prepare_data.sh

# command 2: runs the summarization pipelines on the data and computes ROUGE scores
# (before running this command, you need to download the models as shown above)
./predict_ami.sh

# command 3: prints the results
python show_results.py
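The three commands above depend on one another, so a later one should not run if an earlier one fails. The following sketch chains them from Python and stops at the first failure. The `run_steps` helper and the `AMI_STEPS` list are hypothetical names, not part of the repository. The sketch also assumes it is run from the repository root and substitutes the current interpreter (`sys.executable`) for the bare `python` in command 3.

```python
import subprocess
import sys

# The three commands from Step 3, in the order listed above.
AMI_STEPS = [
    ["./prepare_data.sh"],
    ["./predict_ami.sh"],
    [sys.executable, "show_results.py"],
]


def run_steps(steps, cwd="."):
    """Run each command in order, stopping at the first non-zero exit."""
    for command in steps:
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError, so later steps are skipped.
        subprocess.run(command, cwd=cwd, check=True)


# Example use from the repository root:
#   run_steps(AMI_STEPS)
```

Running the shell commands one after another by hand gives the same result. The wrapper only adds the early stop.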

Owner

Approximately Correct Machine Intelligence (ACMI) Lab
