[EMNLP 2021] MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

Overview

MuVER

This repo contains the code and pre-trained model for our EMNLP 2021 paper:
MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations. Xinyin Ma, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Weiming Lu

Quick Start

1. Requirements

The requirements for our code are listed in requirements.txt; install them with the following command:

pip install -r requirements.txt

For huggingface/transformers, we tested our code under versions 4.1.X and 4.2.X.
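
If you want to double-check your environment, the optional snippet below (not part of the repo) verifies that the installed transformers version is one of the tested ones:

# Optional sanity check (not part of this repo): confirm the installed
# transformers version is in the tested range (4.1.x or 4.2.x).
import transformers

major, minor = transformers.__version__.split(".")[:2]
assert (major, minor) in {("4", "1"), ("4", "2")}, (
    f"Tested with transformers 4.1.x / 4.2.x, found {transformers.__version__}"
)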

2. Download data and model

3. Use the released model to reproduce our results

  • Without View Merging:
export PYTHONPATH='.'
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py \
    --pretrained_model path_to_model/bert-base \
    --bi_ckpt_path path_to_model/best_zeshel.bin \
    --max_cand_len 40 \
    --max_seq_len 128 \
    --do_test \
    --test_mode test \
    --data_parallel \
    --eval_batch_size 16 \
    --accumulate_score

Expected Result:

World R@1 R@2 R@4 R@8 R@16 R@32 R@50 R@64
forgotten_realms 0.6208 0.7783 0.8592 0.8983 0.9342 0.9533 0.9633 0.9700
lego 0.4904 0.6714 0.7690 0.8357 0.8791 0.9091 0.9208 0.9249
star_trek 0.4743 0.6130 0.6967 0.7606 0.8159 0.8581 0.8805 0.8919
yugioh 0.3432 0.4861 0.6040 0.7004 0.7596 0.8201 0.8512 0.8672
total 0.4496 0.5970 0.6936 0.7658 0.8187 0.8628 0.8854 0.8969
  • With View Merging:
export PYTHONPATH='.'
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py \
    --pretrained_model path_to_model/bert-base \
    --bi_ckpt_path path_to_model/best_zeshel.bin \
    --max_cand_len 40 \
    --max_seq_len 128 \
    --do_test \
    --test_mode test \
    --data_parallel \
    --eval_batch_size 16 \
    --accumulate_score \
    --view_expansion \
    --merge_layers 4 \
    --top_k 0.4

Expected Result:

World R@1 R@2 R@4 R@8 R@16 R@32 R@50 R@64
forgotten_realms 0.6175 0.7867 0.8733 0.9150 0.9375 0.9600 0.9675 0.9708
lego 0.5046 0.6889 0.7882 0.8449 0.8882 0.9183 0.9324 0.9374
star_trek 0.4810 0.6253 0.7121 0.7783 0.8271 0.8706 0.8935 0.9030
yugioh 0.3444 0.5027 0.6322 0.7300 0.7902 0.8429 0.8690 0.8826
total 0.4541 0.6109 0.7136 0.7864 0.8352 0.8777 0.8988 0.9084

Optional Arguments:

  • --data_parallel: whether to use multiple GPUs.
  • --accumulate_score: accumulate the score for each entity. This obtains a higher score but takes considerably more time at inference.
  • --view_expansion: whether to merge and expand views (see the illustrative sketch after this list).
  • --top_k: the proportion of view pairs to merge in each layer (e.g., 0.4 in the command above).
  • --merge_layers: the number of merging layers.
  • --test_mode: to generate candidates for the train/dev set, change test_mode to train or dev; the generated candidates will be saved under the directory that contains the test model.
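
For intuition about --view_expansion, --merge_layers and --top_k, here is a minimal Python sketch of the general idea: adjacent views are concatenated, re-scored, and only the best-scoring merges are kept, repeated for a fixed number of layers. The scoring function, the merge rule, and the treatment of top_k as a fraction are simplifying assumptions for illustration; the actual implementation is in muver/multi_view/.

from typing import Callable, List

def expand_views(views: List[str],
                 score_fn: Callable[[str], float],
                 merge_layers: int = 4,
                 top_k: float = 0.4) -> List[str]:
    # Illustrative only: grow a pool of views by repeatedly merging adjacent
    # views and keeping the top_k fraction of merges with the highest score.
    pool = list(views)
    current = list(views)
    for _ in range(merge_layers):
        if len(current) < 2:
            break
        merged = [current[i] + " " + current[i + 1] for i in range(len(current) - 1)]
        scores = [score_fn(v) for v in merged]
        keep = max(1, int(top_k * len(merged)))
        threshold = sorted(scores, reverse=True)[keep - 1]
        # Keep the best-scoring merged views, preserving their original order.
        current = [v for v, s in zip(merged, scores) if s >= threshold][:keep]
        pool.extend(current)
    return pool

# Toy usage with a stand-in scoring function (word count); in MuVER the score
# would come from the mention/entity bi-encoder.
toy_views = ["Paris is the capital of France.",
             "It lies on the Seine.",
             "The Louvre is located there."]
print(expand_views(toy_views, score_fn=lambda v: len(v.split()),
                   merge_layers=2, top_k=0.5))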

4. How to train your MuVER

We provide the code to train your own MuVER. Train the model with the following command:

export PYTHONPATH='.'
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py \
    --pretrained_model path_to_model/bert-base \
    --epoch 30 \
    --train_batch_size 128 \
    --learning_rate 1e-5 \
    --do_train --do_eval \
    --data_parallel \
    --name distributed_multi_view

Important: Since contrastive learning relies heavily on a large batch size, we use eight V100 (16GB) GPUs to train our model, as reported in our paper. The hyperparameters for our best model are in logs/zeshel_hyper_param.txt.
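
To see why batch size matters here, the sketch below shows a generic in-batch-negative contrastive loss for a bi-encoder retriever (a common setup, not necessarily the exact loss used in this repo): every other entity in the batch serves as a negative for a given mention, so a larger batch directly provides more negatives per example.

import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(mention_emb: torch.Tensor,
                              entity_emb: torch.Tensor) -> torch.Tensor:
    # mention_emb, entity_emb: (batch, dim); row i of entity_emb is the gold
    # entity for mention i, and every other row acts as an in-batch negative.
    scores = mention_emb @ entity_emb.t()          # (batch, batch) similarities
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

# With train_batch_size 128, each mention is contrasted against 127 negatives.
mentions, entities = torch.randn(128, 768), torch.randn(128, 768)
print(in_batch_contrastive_loss(mentions, entities))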

The code creates a directory runtime_log to save the log, model, and the hyperparameters you used. Every time you train your model (with or without grid search), it creates a directory under runtime_log/name_in_your_args/start_time, e.g., runtime_log/distributed_multi_view/2021-09-07-15-12-21, to store all the checkpoints, curves for visualization, and the training log.
