PyTorch reimplementation of REALM and ORQA

Last update: Aug 20, 2022

This is a PyTorch reimplementation of REALM (paper, codebase) and ORQA (paper, codebase).

Some features have not been implemented yet; currently, only the predictor and the finetuning script are available.

The terms retriever and searcher are largely interchangeable in the code. The difference is that the retriever is used for REALM pretraining, while the searcher is used for ORQA finetuning.
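
For readers new to these models, the retrieval step that both components perform can be sketched as follows: a question embedding is scored against every evidence block embedding by inner product, and the top-k blocks are kept. This is only an illustrative brute-force version. The function names and the toy vectors are hypothetical and are not taken from this repository, which relies on an approximate search library (ScaNN) rather than an exhaustive scan.

```python
import heapq
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_blocks(
    query: Sequence[float],
    block_embs: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (block_index, score) pairs for the k highest-scoring blocks.

    Hypothetical helper: an exhaustive scan standing in for the
    approximate maximum inner product search a real system would use.
    """
    scores = [(i, inner_product(query, emb)) for i, emb in enumerate(block_embs)]
    return heapq.nlargest(k, scores, key=lambda pair: pair[1])


# Toy data (made up for illustration): one 2-d query, three 2-d blocks.
toy_query = [1.0, 0.0]
toy_blocks = [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
toy_result = top_k_blocks(toy_query, toy_blocks, k=2)
```

With the toy data above, blocks 0 and 2 are returned in that order, since their inner products with the query (0.9 and 0.5) are the two largest.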

Prerequisite

cd transformers && pip install -U -e ".[dev]"
pip install -U scann apache_beam

Data

To download pretrained checkpoints and preprocessed data, please follow the instructions below:

cd data
pip install -U -r requirements.txt
sh download.sh

Finetune (Experimental)

The default finetuning dataset is Natural Questions (NQ). To load your custom dataset, change the loading function in data.py.
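
As a rough starting point for a custom loader, the sketch below reads question-answer pairs from a JSON Lines file. Everything in it is an assumption made for illustration: the function name `load_custom_qa`, the `question` and `answer` field names, and the returned structure are hypothetical and do not reflect the actual interface in data.py, so adapt it to whatever that loading function expects.

```python
import json
from typing import Dict, List


def load_custom_qa(path: str) -> List[Dict[str, object]]:
    """Read one JSON object per line and normalize it to question/answers.

    Assumed (hypothetical) input format, one record per line:
        {"question": "...", "answer": ["...", "..."]}
    A single string is also accepted for "answer".
    """
    examples: List[Dict[str, object]] = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            answers = record["answer"]
            if isinstance(answers, str):
                answers = [answers]  # wrap a lone answer in a list
            examples.append(
                {"question": record["question"], "answers": list(answers)}
            )
    return examples
```

Keeping the answers as a list allows several acceptable answer strings per question, which is common in open-domain QA data.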

Training:

python run_finetune.py --is_train \
    --model_dir "./" \
    --num_epochs 2 \
    --device cuda

Evaluation:

python run_finetune.py \
    --retriever_pretrained_name "retriever" \
    --checkpoint_pretrained_name "reader" \
    --model_dir "./" \
    --device cuda

Predict

By default, both the retriever and the reader use the orqa_nq_model_from_realm checkpoints. To use different ones, specify --retriever_path and --checkpoint_path.

python predictor.py --question "Who is the pioneer in modern computer science?"

Output: alan mathison turing
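
To ask several questions from a script, the command above can be wrapped with Python's standard `subprocess` module. This is a minimal sketch that uses only the `--question` flag shown above. The helper names are hypothetical, and because the exact format of what predictor.py prints is not specified here, `predict` simply returns the raw standard output.

```python
import subprocess
import sys
from typing import List


def build_predict_command(question: str) -> List[str]:
    """Build the argument list for one predictor.py call.

    Passing arguments as a list avoids shell quoting problems
    when a question contains quotes or other special characters.
    """
    return [sys.executable, "predictor.py", "--question", question]


def predict(question: str) -> str:
    """Run predictor.py once and return its raw stdout, stripped.

    Assumes the script is run from the repository root, where
    predictor.py lives. Raises CalledProcessError on a non-zero exit.
    """
    result = subprocess.run(
        build_predict_command(question),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `predict("Who is the pioneer in modern computer science?")` from the repository root runs the same command as above. Each call starts a new process, so the checkpoints are presumably reloaded every time; for many questions, importing the model code directly would likely be faster.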

License

Apache License 2.0

Owner

Li-Huai (Allan) Lin
