TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech

Last update: Feb 07, 2022

Overview

TFPNER

Named entity recognition (NER), which aims to identify real-world entity mentions in text, is a fundamental task in natural language processing with a wide range of applications. Previous approaches mainly focus on the raw sentence alone, but part-of-speech (POS) tags carry rich semantic information and contribute to the success of natural language processing tasks. To further improve NER performance, we propose five methods that fuse POS tags with the original tokens on top of the BERT model: concatenating tokens and POS tags as one sentence, concatenating them as two sentences, adding a POS embedding as one of the embedding elements, model ensembling, and multi-attention between the token representations and the POS representations. We evaluate on the CoNLL-2003 and Groningen Meaning Bank (GMB) datasets, both of which provide NER tags and POS tags. In our experiments on the two datasets, some of the proposed methods improve performance over the baseline methods.
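The two concatenation variants mentioned in the abstract can be pictured with a small sketch. This is not the project's code: the function names `fuse_one_sentence` and `fuse_two_sentences`, the choice to place all POS tags after all tokens, and the segment-id layout are assumptions made for illustration, and the real implementation may order or tokenize the inputs differently.

```python
from typing import List, Tuple

# BERT-style special markers (assumed; a real tokenizer would add these itself).
CLS, SEP = "[CLS]", "[SEP]"


def fuse_one_sentence(tokens: List[str], pos_tags: List[str]) -> Tuple[List[str], List[int]]:
    """Hypothetical 'one sentence' fusion: tokens and POS tags share one segment."""
    if len(tokens) != len(pos_tags):
        raise ValueError("each token needs exactly one POS tag")
    pieces = [CLS] + tokens + pos_tags + [SEP]
    segment_ids = [0] * len(pieces)  # everything belongs to segment 0
    return pieces, segment_ids


def fuse_two_sentences(tokens: List[str], pos_tags: List[str]) -> Tuple[List[str], List[int]]:
    """Hypothetical 'two sentences' fusion: tokens in segment 0, POS tags in segment 1."""
    if len(tokens) != len(pos_tags):
        raise ValueError("each token needs exactly one POS tag")
    pieces = [CLS] + tokens + [SEP] + pos_tags + [SEP]
    # Segment 0 covers [CLS], the tokens, and the first [SEP];
    # segment 1 covers the POS tags and the final [SEP].
    segment_ids = [0] * (len(tokens) + 2) + [1] * (len(pos_tags) + 1)
    return pieces, segment_ids


if __name__ == "__main__":
    # Illustrative input only.
    words = ["EU", "rejects", "German", "call"]
    tags = ["NNP", "VBZ", "JJ", "NN"]
    print(fuse_one_sentence(words, tags))
    print(fuse_two_sentences(words, tags))
```

In either layout, only the positions that correspond to the original tokens would receive NER labels; how the POS positions are masked during training is left open here.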

This is a project I worked on with Haoqing Tang, an extraordinary computer scientist in the CV and NLP area, during an interesting and memorable period of Master's study.

ChatBotProyect - This is an unfinished project about a simple chatbot.

🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted by EMNLP'2021 🌴

This GitHub repo is for the NeurIPS 2021 paper, NORESQA: A Framework for Speech Quality Assessment using Non-Matching References.

Library for Russian imprecise rhymes generation

RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2

Convolutional 2D Knowledge Graph Embeddings resources

BPEmb is a collection of pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) and trained on Wikipedia.

Simple translation demo showcasing our headliner package.

AI Assistant for Building Reliable, High-performing and Fair Multilingual NLP Systems

Easy-to-use CPM for Chinese text generation

The entmax mapping and its loss, a family of sparse softmax alternatives.

Python package for Turkish Language.

An Open-Source Package for Neural Relation Extraction (NRE)

Rich Prosody Diversity Modelling with Phone-level Mixture Density Network

Code-autocomplete, a code completion plugin for Python

Tokenizer - Python module for syntactic and grammatical analysis, tokenization

🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.

Code for CodeT5: a new code-aware pre-trained encoder-decoder model.

Machine Learning Course Project: IMDB movie review sentiment analysis with LSTM, CNN, and Transformer models

Text classification on IMDB dataset using Keras and Bi-LSTM network