Automatic Idiomatic Expression Detection

Related tags

Deep LearningDISC
Overview

IDentifier of Idiomatic Expressions via Semantic Compatibility (DISC)

An Idiomatic identifier that detects the presence and span of idiomatic expression in a given sentence.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. License
  5. Contact
  6. Acknowledgements

About The Project

This project is a supervised idiomatic expression identification method. Given a sentence that contains a potentially idiomatic expression (PIE), the model identifies the span of the PIE if it is indeed used in an idiomatic sense, otherwise, the model does not identify the PIE. The identification is done via checking the smemantic compatibility. More details will be updated here (Detail description, figures, etc.).

The paper will appear in TACL.

Built With

This model is heavily relying the resources/libraries list as following:

Getting Started

The implementation here includes processed data created for MAGPIE random-split dataset. The model checkpoint that trained with MAGPIE random-split is also provided.

Prerequisites

All the dependencies for this project is listed in requirements.txt. You can install them via a standard command:

pip install -r requirements.txt

It is highly recommanded to start a conda environment with PyTorch properly installed based on your hardward before install the other requirements.

Checkpoint

To run the model with a pre-trained checkpoint, please first create a ./checkpoints folder at root. Then, please download the checkpoint from Google Drive via this Link. Please put the checkpoint in the ./checkpoints folder.

Usage

Configuration

Before running the demo or experiments (training or testing), please see the config.py which sets the configuration of the model. Some parameters there, such as MODE needs to be set appropriately for the model to run correctly. Please see comments for more details.

Demo

To start, please go through the examples provided in demo.ipynb. In there, we process a given input sentence into the model input data and then run model inference to extract the idiomatic expression (if present) from the input sentence (visualized).

Data processing

To process a dataset (such as MAGPIE) for model training and testing, please refer to ./data_processing/MAGPIE/read_comp_data_processing.ipynb. It takes a dataset with sententences and their PIE lcoations as input and generate all the necessary files for model training and inference.

Training and Testing

For training and testing, please refer to train.py and test.py. Note that test.py is used to produce evaluation scores as shown in the paper. inference.py is used to produce prediction for sentences.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Ziheng Zeng - [email protected]

Project Link: https://github.com/your_username/repo_name

Acknowledgements

[TODO]:

Add the following in README:

  • Method detail descrption
  • Method figure
  • Demo walkthrough
  • Data processing tips and instructions Add requirements.txt
Tools for investing in Python

InvestOps Original repository on GitHub Original author is Magnus Erik Hvass Pedersen Introduction This is a Python package with simple and effective

24 Nov 26, 2022
Learning to Identify Top Elo Ratings with A Dueling Bandits Approach

Learning to Identify Top Elo Ratings We propose two algorithms MaxIn-Elo and MaxIn-mElo to solve the top players identification on the transitive and

2 Jan 14, 2022
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Ro

Meta Research 1.2k Jan 02, 2023
Official implementation of "Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform", ICCV 2021

Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform This repository is the implementation of "Variable-Rate Deep Image C

Myungseo Song 47 Dec 13, 2022
Non-Homogeneous Poisson Process Intensity Modeling and Estimation using Measure Transport

Non-Homogeneous Poisson Process Intensity Modeling and Estimation using Measure Transport This GitHub page provides code for reproducing the results i

Andrew Zammit Mangion 1 Nov 08, 2021
In Search of Probeable Generalization Measures

In Search of Probeable Generalization Measures Exciting News! In Search of Probeable Generalization Measures has been accepted to the International Co

Mahdi S. Hosseini 6 Sep 11, 2022
Pytorch implementation for "Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion" (NeurIPS 2021)

Density-aware Chamfer Distance This repository contains the official PyTorch implementation of our paper: Density-aware Chamfer Distance as a Comprehe

Tong WU 93 Dec 15, 2022
ESTDepth: Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks (CVPR 2021)

ESTDepth: Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks (CVPR 2021) Project Page | Video | Paper | Data We present a novel metho

65 Nov 28, 2022
The "breathing k-means" algorithm with datasets and example notebooks

The Breathing K-Means Algorithm (with examples) The Breathing K-Means is an approximation algorithm for the k-means problem that (on average) is bette

Bernd Fritzke 75 Nov 17, 2022
Code for the paper 'A High Performance CRF Model for Clothes Parsing'.

Clothes Parsing Overview This code provides an implementation of the research paper: A High Performance CRF Model for Clothes Parsing Edgar Simo-S

Edgar Simo-Serra 119 Nov 21, 2022
Python 3 module to print out long strings of text with intervals of time inbetween

Python-Fastprint Python 3 module to print out long strings of text with intervals of time inbetween Install: pip install fastprint Sync Usage: from fa

Kainoa Kanter 2 Jun 27, 2022
Pytorch implementation for the EMNLP 2020 (Findings) paper: Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering

Path-Generator-QA This is a Pytorch implementation for the EMNLP 2020 (Findings) paper: Connecting the Dots: A Knowledgeable Path Generator for Common

Peifeng Wang 33 Dec 05, 2022
Happywhale - Whale and Dolphin Identification Silver🥈 Solution (26/1588)

Kaggle-Happywhale Happywhale - Whale and Dolphin Identification Silver 🥈 Solution (26/1588) 竞赛方案思路 图像数据预处理-标志性特征图片裁剪:首先根据开源的标注数据训练YOLOv5x6目标检测模型,将训练集

Franxx 20 Nov 14, 2022
A new GCN model for Point Cloud Analyse

Pytorch Implementation of PointNet and PointNet++ This repo is implementation for VA-GCN in pytorch. Classification (ModelNet10/40) Data Preparation D

12 Feb 02, 2022
Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe

Traductor de señas Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe Requerimientos 🔧 Python 3.8 o inferior para evitar

Jahaziel Hernandez Hoyos 3 Nov 12, 2022
Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

Little Ball of Fur is a graph sampling extension library for Python. Please look at the Documentation, relevant Paper, Promo video and External Resour

Benedek Rozemberczki 619 Dec 14, 2022
Ludwig is a toolbox that allows to train and evaluate deep learning models without the need to write code.

Translated in 🇰🇷 Korean/ Ludwig is a toolbox that allows users to train and test deep learning models without the need to write code. It is built on

Ludwig 8.7k Jan 05, 2023
Codes for our paper The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders published to EMNLP 2021.

The Stem Cell Hypothesis Codes for our paper The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders published to EMNLP

Emory NLP 5 Jul 08, 2022
Tensorflow 2.x based implementation of EDSR, WDSR and SRGAN for single image super-resolution

Single Image Super-Resolution with EDSR, WDSR and SRGAN A Tensorflow 2.x based implementation of Enhanced Deep Residual Networks for Single Image Supe

Martin Krasser 1.3k Jan 06, 2023
TRIQ implementation

TRIQ Implementation TF-Keras implementation of TRIQ as described in Transformer for Image Quality Assessment. Installation Clone this repository. Inst

Junyong You 115 Dec 30, 2022