The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Last update: Nov 21, 2022

Overview

Language Models are Few-shot Multilingual Learners

Paper

This is the source code of the paper [Arxiv] [ACL Anthology].

This code is written in PyTorch. If you use the source code or datasets included in this toolkit in your work, please cite the following paper:

@inproceedings{winata-etal-2021-language,
    title = "Language Models are Few-shot Multilingual Learners",
    author = "Winata, Genta Indra  and
      Madotto, Andrea  and
      Lin, Zhaojiang  and
      Liu, Rosanne  and
      Yosinski, Jason  and
      Fung, Pascale",
    booktitle = "Proceedings of the 1st Workshop on Multilingual Representation Learning",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.mrl-1.1",
    pages = "1--15",
}

Setup Environment

GPU Machine

pip install -r requirements.txt

GPU Machine for Running GPT-J 6B Model

apt install zstd

# the "slim" version contains only bf16 weights and no optimizer parameters, which minimizes bandwidth and memory
wget -c https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd

tar -I zstd -xf step_383500_slim.tar.zstd

pip install -r mesh-transformer-jax/requirements.txt

# jax 0.2.12 is required due to a regression with xmap in 0.2.13
pip install mesh-transformer-jax/ jax==0.2.12

# replace cuda101 below with the suffix for your CUDA version
pip install jaxlib==0.1.67+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html
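
Because the GPT-J setup depends on exact jax and jaxlib versions, it can help to confirm what actually got installed before running anything. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `check_pins` are hypothetical, and the expected versions are simply copied from the install commands above.

```python
from importlib import metadata

# Versions pinned by the install commands above.
EXPECTED = {"jax": "0.2.12", "jaxlib": "0.1.67"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(expected=EXPECTED):
    """Return a list of human-readable problems; an empty list means all pins match."""
    problems = []
    for package, wanted in expected.items():
        found = installed_version(package)
        if found is None:
            problems.append(f"{package} is not installed (expected {wanted})")
        # jaxlib wheels carry a local suffix such as "+cuda101", so compare the base version only.
        elif found.split("+")[0] != wanted:
            problems.append(f"{package} is {found} (expected {wanted})")
    return problems


if __name__ == "__main__":
    for line in check_pins() or ["jax and jaxlib match the pinned versions"]:
        print(line)
```

Running the script prints one line per mismatch, so it can be used as a quick check right after the `pip install` steps.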

How to run

Zero-shot Cross-task

❱❱❱ CUDA_VISIBLE_DEVICES=0 python evaluate.py --dataset snips --model_checkpoint facebook/bart-large-mnli --cuda --length 5 --label_type value --src_lang en --tgt_lang en --seed 42 --use_log_prob --use_confidence --is_cross_task
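
The command above evaluates an NLI checkpoint (facebook/bart-large-mnli) on an intent dataset. One common way to use an NLI model for classification is to turn each candidate label into a hypothesis and pick the label whose hypothesis is most strongly entailed by the input. The sketch below illustrates that general idea only and is not taken from `evaluate.py`: the template, the function names, and the toy word-overlap scorer are all hypothetical stand-ins, and a real run would replace `toy_entail_score` with the entailment probability from the NLI model.

```python
# Hypothetical template; evaluate.py may phrase hypotheses differently.
TEMPLATE = "This example is about {}."


def classify_zero_shot(text, labels, entail_score, template=TEMPLATE):
    """Return (best_label, scores), scoring each label as an NLI hypothesis."""
    if not labels:
        raise ValueError("labels must not be empty")
    scores = {
        label: entail_score(text, template.format(label)) for label in labels
    }
    # Pick the label with the highest score (first one wins on ties).
    best = max(scores, key=scores.get)
    return best, scores


def toy_entail_score(premise, hypothesis):
    """Stand-in scorer: fraction of hypothesis words that also occur in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().rstrip(".").split()
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)


if __name__ == "__main__":
    label, scores = classify_zero_shot(
        "play some music by queen", ["music", "weather"], toy_entail_score
    )
    print(label, scores)
```

Keeping the scorer as a plain callable means the label-selection logic can be tried without downloading any model weights.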

Finetune

❱❱❱ CUDA_VISIBLE_DEVICES=0 python finetune.py --dataset snips --model_checkpoint bert-base-multilingual-uncased --cuda --label_type value --src_lang en --tgt_lang en --seed 42
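
To repeat the finetuning run with several seeds, the command can be generated rather than typed by hand. This is a small sketch under assumptions: `finetune_command` is a hypothetical helper, it uses only the flags shown in the command above, and the seeds other than 42 are arbitrary examples.

```python
import shlex


def finetune_command(seed, dataset="snips",
                     checkpoint="bert-base-multilingual-uncased",
                     src_lang="en", tgt_lang="en"):
    """Build the finetune.py argument list using only the flags shown above."""
    return [
        "python", "finetune.py",
        "--dataset", dataset,
        "--model_checkpoint", checkpoint,
        "--cuda",
        "--label_type", "value",
        "--src_lang", src_lang,
        "--tgt_lang", tgt_lang,
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Seeds 43 and 44 are illustrative; the README only shows seed 42.
    for seed in (42, 43, 44):
        print(shlex.join(finetune_command(seed)))
```

Each printed line can be pasted into a shell, prefixed with `CUDA_VISIBLE_DEVICES=0` as in the command above.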

Owner

Genta Indra Winata
