NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings

Last update: Nov 15, 2022

MCSE: Multimodal Contrastive Learning of Sentence Embeddings

This repository contains code and pre-trained models for our NAACL-2022 paper MCSE: Multimodal Contrastive Learning of Sentence Embeddings. If you find this repository useful, please consider citing our paper.

Contact: Miaoran Zhang ([email protected])

Pre-trained Models & Results

| Model | Avg. STS |
|---|---|
| flickr-mcse-bert-base-uncased [Google Drive] | 77.70 |
| flickr-mcse-roberta-base [Google Drive] | 78.44 |
| coco-mcse-bert-base-uncased [Google Drive] | 77.08 |
| coco-mcse-roberta-base [Google Drive] | 78.17 |

Note: flickr indicates that models are trained on wiki+flickr, and coco indicates that models are trained on wiki+coco.

Quickstart

Setup

Python 3.9.5
PyTorch 1.7.1
Install other packages:

pip install -r requirements.txt

Data Preparation

Please organize the data directory as follows:

REPO ROOT
|
|--data    
|  |--wiki1m_for_simcse.txt  
|  |--flickr_random_captions.txt    
|  |--flickr_resnet.hdf5    
|  |--coco_random_captions.txt    
|  |--coco_resnet.hdf5
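
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_data_files` is hypothetical, and the file names are simply copied from the tree above.

```python
from pathlib import Path

# File names copied from the directory layout above.
EXPECTED_FILES = [
    "wiki1m_for_simcse.txt",
    "flickr_random_captions.txt",
    "flickr_resnet.hdf5",
    "coco_random_captions.txt",
    "coco_resnet.hdf5",
]


def missing_data_files(data_dir="data", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that their contents are valid.
    """
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing files in data/:", ", ".join(missing))
    else:
        print("All expected data files are present.")
```

Run it from the repository root. If you only plan to train on one image dataset, you can presumably pass a shorter `expected` list that omits the other dataset's files.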

Wiki1M

wget https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt

Flickr30k & MS-COCO
You can either download the preprocessed data we used:
(annotation sources: flickr30k-entities and coco).

Or preprocess the data yourself (taking Flickr30k as an example):

Download the flickr30k-entities.
Request access to the flickr-images from here. Note that the use of the images must abide by the Flickr Terms of Use.

Run script:

unzip ${path_to_flickr-entities}/annotations.zip

python preprocess/prepare_flickr.py \
    --flickr_entities_dir ${path_to_flickr-entities} \
    --flickr_images_dir ${path_to_flickr-images} \
    --output_dir data/ \
    --batch_size 32

Train & Evaluation

Prepare the SentEval datasets for evaluation:

cd SentEval/data/downstream/
bash download_dataset.sh

Run scripts:
```
# For example:  (more examples are given in scripts/.)
sh scripts/run_wiki_flickr.sh
```
Note: In the paper we run experiments with 5 seeds (0, 1, 2, 3, 4). You can find the detailed parameter settings in the Appendix of the paper.
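
If you repeat a run with several seeds, one common way to summarize the scores is the mean and sample standard deviation. The sketch below is an illustration only: `summarize_seed_scores` is a hypothetical helper, not a function from this repository, and the numbers in the usage example are placeholders rather than results from the paper.

```python
from statistics import mean, stdev


def summarize_seed_scores(scores_by_seed):
    """Return (mean, sample standard deviation) over per-seed scores.

    scores_by_seed maps a seed (e.g. 0..4) to the Avg. STS score that
    you obtained for that seed.
    """
    scores = list(scores_by_seed.values())
    if len(scores) < 2:
        raise ValueError("need scores from at least two seeds")
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT results from the paper.
    placeholder = {0: 70.0, 1: 71.0, 2: 72.0, 3: 73.0, 4: 74.0}
    avg, sd = summarize_seed_scores(placeholder)
    print(f"Avg. STS over {len(placeholder)} seeds: {avg:.2f} (std {sd:.2f})")
```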

Acknowledgements

The extremely clear and well-organized codebase: SimCSE
SentEval toolkit

Owner

Saarland University Spoken Language Systems Group
