Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data

This is the XLM-T repository, which includes data, code and pre-trained multilingual language models for Twitter.

XLM-T - A Multilingual Language Model Toolkit for Twitter

As explained in the reference paper, we start from XLM-RoBERTa base and continue pre-training on a large corpus of tweets in multiple languages. This masked language model, which we named twitter-xlm-roberta-base on the 🤗 Huggingface hub, can be downloaded from here.

Note: This Twitter-specific LM was pretrained following a strategy similar to that of its English-only counterpart, which was introduced as part of the TweetEval framework and is available here.

We also provide task-specific models based on the Adapter technique, fine-tuned for cross-lingual sentiment analysis (see #2).

1 - Code

We include code with various functionalities to complement this release. Among other things, this notebook provides examples of feature extraction and adapter-based inference with language models. We also provide examples for training and evaluating language models on multiple tweet classification tasks, compatible with the UMSAB (see #2) and TweetEval datasets.
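As a rough illustration of what feature extraction typically involves, the sketch below turns per-token vectors into a single tweet embedding by masked mean pooling and compares two embeddings with cosine similarity. It uses plain Python lists in place of real model outputs. The function names are hypothetical, and mean pooling is an assumption here, since the notebook may pool differently.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy 2-dimensional "token embeddings" (assumed values); the last row is padding.
tokens = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]
tweet_vec = mean_pool(tokens, mask)  # [2.0, 1.0]
print(tweet_vec, cosine(tweet_vec, [2.0, 1.0]))
```

With a real model, the token vectors would come from the model's last hidden layer and the mask from the tokenizer.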

Perform inference with Huggingface's pipelines

Using Huggingface's pipelines, obtaining predictions is as easy as:

from transformers import pipeline
model_path = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
sentiment_task = pipeline("sentiment-analysis", model=model_path, tokenizer=model_path)
sentiment_task("Huggingface es lo mejor! Awesome library 🤗😎")
[{'label': 'Positive', 'score': 0.9343640804290771}]
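When a list of tweets is passed to the pipeline, it returns one dictionary per input in the format shown above. A minimal sketch of tallying such outputs follows, assuming only the label and score keys from the example output. The helper tally_labels and its min_score parameter are hypothetical names introduced for illustration; they are not part of this repository or of transformers.

```python
from collections import Counter

def tally_labels(predictions, min_score=0.0):
    """Count predicted labels, ignoring predictions scored below min_score."""
    counts = Counter()
    for pred in predictions:
        if pred["score"] >= min_score:
            counts[pred["label"]] += 1
    return dict(counts)

# Hand-written stand-in for pipeline output (illustrative scores, not real model output).
example_output = [
    {"label": "Positive", "score": 0.93},
    {"label": "Negative", "score": 0.51},
    {"label": "Positive", "score": 0.78},
]
print(tally_labels(example_output, min_score=0.6))  # {'Positive': 2}
```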

Fine-tune XLM-T with adapters

You can fine-tune an adapter built on top of your language model of choice by running the src/adapter_finetuning.py script, for example:

python3 src/adapter_finetuning.py --language spanish --model cardiffnlp/twitter-xlm-roberta-base --seed 1 --lr 0.0001 --max_epochs 20

Notebooks

For quick prototyping, you can directly use the Colab notebooks we provide below:

Notebook                  Description                       Colab Link
01: Playground examples   Minimal start examples            Open In Colab
02: Extract embeddings    Extract embeddings from tweets    Open In Colab
03: Sentiment prediction  Predict sentiment                 Open In Colab
04: Fine-tuning           Fine-tune a model on custom data  Open In Colab

2 - UMSAB, the Unified Multilingual Sentiment Analysis Benchmark

As part of our framework, we also release a unified benchmark for cross-lingual sentiment analysis covering eight languages. All datasets are framed as tweet classification with three labels (positive, negative and neutral). The languages included in the benchmark, and the datasets they are based on, are: Arabic (SemEval-2017, Rosenthal et al. 2017), English (SemEval-2017, Rosenthal et al. 2017), French (Deft-2017, Benamara et al. 2017), German (SB-10K, Cieliebak et al. 2017), Hindi (SAIL 2015, Patra et al. 2015), Italian (Sentipolc-2016, Barbieri et al. 2016), Portuguese (SentiBR, Brum and Nunes, 2017) and Spanish (Intertass 2017, Díaz Galiano et al. 2018). Each dataset follows the TweetEval format, with one tweet per line and one label per line.
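The line-based layout can be read with a few lines of Python. This is a minimal sketch, not code from the repository: it assumes, as in TweetEval, that tweets and labels are stored in two separate, line-aligned files, and the names load_split, tweets.txt and labels.txt are placeholders chosen for illustration.

```python
def read_lines(path):
    """Return the lines of a UTF-8 text file without their trailing newline."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]

def load_split(text_path, labels_path):
    """Pair line i of the tweet file with line i of the label file."""
    tweets = read_lines(text_path)
    labels = [label.strip() for label in read_lines(labels_path)]
    if len(tweets) != len(labels):
        raise ValueError(f"{len(tweets)} tweets but {len(labels)} labels")
    return list(zip(tweets, labels))

# Hypothetical usage; replace the placeholder paths with your local files:
# pairs = load_split("tweets.txt", "labels.txt")
```

Checking that both files have the same number of lines catches the most common alignment error before any training or evaluation starts.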

UMSAB Results / Leaderboard

The following results (Macro F1 reported) correspond to XLM-R (Conneau et al. 2020) and XLM-Tw, the same model retrained on Twitter as explained in the reference paper. The two settings are monolingual (trained and tested in the same language) and multilingual (considering all languages for training). Check the reference paper for more details on the setting and the metrics.

Language     FT Mono   XLM-R Mono   XLM-Tw Mono   XLM-R Multi   XLM-Tw Multi
Arabic       46.0      63.6         67.7          64.3          66.9
English      50.9      68.2         66.9          68.5          70.6
French       54.8      72.0         68.2          70.5          71.2
German       59.6      73.6         76.1          72.8          77.3
Hindi        37.1      36.6         40.3          53.4          56.4
Italian      54.7      71.5         70.9          68.6          69.1
Portuguese   55.1      67.1         76.0          69.8          75.4
Spanish      50.1      65.9         68.5          66.0          67.9
All lang.    51.0      64.8         66.8          66.8          69.4

If you would like to have your results added to the leaderboard you can either submit a pull request or send an email to any of the paper authors with results and the predictions of your model. Please also submit a reference to a paper describing your approach.

Evaluating your system

To evaluate your system with Macro-F1, you need one prediction file per language. The format of each prediction file should match the output examples in the predictions folder (one output label per line, in the same order as the original test file), and the files should be named language.txt (e.g. arabic.txt, or all.txt if evaluating all languages at once). The predictions included as an example in this repo correspond to XLM-T trained and evaluated on all languages (All lang.).
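For reference, Macro-F1 is the unweighted mean of the per-class F1 scores. The sketch below computes it from two aligned label lists, such as the lines of a gold file and a prediction file. It is an illustration only, not the contents of src/evaluation_script.py: the function name is hypothetical, the label strings are made up for the example, and averaging over the classes that occur in the gold labels is an assumption of this sketch.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over the classes found in gold."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    scores = []
    for cls in sorted(set(gold)):
        pairs = list(zip(gold, predicted))
        tp = sum(g == cls and p == cls for g, p in pairs)  # true positives
        fp = sum(g != cls and p == cls for g, p in pairs)  # false positives
        fn = sum(g == cls and p != cls for g, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

gold = ["negative", "neutral", "positive", "positive", "neutral", "negative"]
pred = ["negative", "neutral", "positive", "neutral", "neutral", "positive"]
print(round(macro_f1(gold, pred), 4))  # 0.6556
```

In this toy example the per-class F1 scores are 2/3 (negative), 4/5 (neutral) and 1/2 (positive), so every class counts equally regardless of how many tweets it has.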

Example usage

python src/evaluation_script.py

By default, the script reads the gold test labels and the predictions from the "predictions" folder, but you can override these through optional arguments.

Optional arguments

Three optional arguments can be modified:

--gold_path: Path to gold datasets. Default: ./data/sentiment

--predictions_path: Path to predictions directory. Default: ./predictions/sentiment

--language: Language to evaluate (arabic, english ... or all). Default: all

Evaluation script sample usage from the terminal with parameters:

python src/evaluation_script.py --gold_path ./data/sentiment --predictions_path ./predictions/sentiment --language arabic

(This command outputs the results for the Arabic dataset only.)

Reference paper

If you use this repository in your research, please use the following bib entry to cite the reference paper.

@inproceedings{barbieri2021xlmtwitter,
  title={{A Multilingual Language Model Toolkit for Twitter}},
  author={Barbieri, Francesco and Espinosa-Anke, Luis and Camacho-Collados, Jose},
  booktitle={arXiv preprint arXiv:2104.12250},
  year={2021}
}

If using UMSAB, please also cite its corresponding datasets.

License

This repository is released open-source, but restrictions may apply to individual datasets (which are derived from existing data) or to Twitter (the main data source). We refer users to the original licenses accompanying each dataset and to Twitter regulations.

Owner
Cardiff NLP