Interpretable-contrastive-word-mover-s-embedding

Paper Datasets

Here is a Dropbox link to the datasets used in the paper: https://www.dropbox.com/sh/nf532hddgdt68ix/AABGLUiPRyXv6UL2YAcHmAFqa?dl=0

The datasets in the above link are provided as .mat files. You may need to convert them to .npy files to run our code. Each .mat file contains the following components (the file names in parentheses are the corresponding .npy files for the BBCsports data):
X is a cell array of all documents, each represented by a d x m matrix, where d is the dimensionality of the word embedding and m is the number of unique words in the document ('BBCsports.npy').
Y is an array of labels ('BBCsports_grade.npy').
BOW_X is a cell array of word counts for each document ('weight.npy').
indices is a cell array of global unique IDs for the words in a document.
TR is a matrix whose ith row is the ith training split of document indices ('index_tr.npy').
TE is a matrix whose ith row is the ith testing split of document indices ('index_te.npy').
'BBCsports_length.npy' contains the number of unique words for each sample.
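
The conversion from .mat to .npy described above could be sketched as follows. This is a minimal sketch, not the authors' preprocessing script: the helper names (load_mat, cell_to_object_array, unique_word_counts, convert_fields) and the input file name in the usage comment are hypothetical, and the exact array layout that run_pos.py expects is not documented here, so the output may need further reshaping. It assumes NumPy and SciPy are installed and that the .mat files are in a format scipy.io.loadmat can read (MATLAB v7.3 files would need an HDF5 reader instead).

```python
from pathlib import Path

import numpy as np

# Field-to-file mapping taken from the component list above.
NAMES = {
    "X": "BBCsports.npy",
    "Y": "BBCsports_grade.npy",
    "BOW_X": "weight.npy",
    "TR": "index_tr.npy",
    "TE": "index_te.npy",
}


def load_mat(path):
    """Load a .mat file into a dict of NumPy arrays (requires SciPy)."""
    from scipy.io import loadmat  # imported lazily; assumes SciPy is available
    return loadmat(path)


def cell_to_object_array(cell):
    """Flatten a MATLAB cell array (a NumPy object array) to a 1-D object array."""
    flat = np.empty(cell.size, dtype=object)
    for i, item in enumerate(cell.ravel()):
        flat[i] = np.asarray(item)
    return flat


def unique_word_counts(cell):
    """Number of unique words per document: the column count m of each d x m matrix."""
    return np.array([doc.shape[1] for doc in cell_to_object_array(cell)])


def convert_fields(mat, out_dir, names=NAMES):
    """Save each field listed in `names` that is present in `mat` as a .npy file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for key, filename in names.items():
        if key not in mat:
            continue
        value = mat[key]
        if value.dtype == object:  # cell arrays load as object arrays
            value = cell_to_object_array(value)
        np.save(out_dir / filename, value, allow_pickle=True)
    if "X" in mat:
        # Assumption: the length file holds the per-document column count of X.
        np.save(out_dir / "BBCsports_length.npy", unique_word_counts(mat["X"]))


# Example usage (hypothetical input file name):
#   convert_fields(load_mat("bbcsport.mat"), ".")
```

Files that hold cell arrays are saved as object arrays, so they have to be read back with np.load(path, allow_pickle=True). Indices saved from MATLAB (TR, TE) may be 1-based and are written unchanged here.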

Demo

The demo code uses the BBCsports dataset. The preprocessed data, saved as .npy files, can be found at the following link: https://drive.google.com/drive/folders/1GuQsHS1J8J24GnCmTCTDPH5hWWYtmw4s?usp=sharing
Please put the data in the same directory as the 2 Python files.
Use

python run_pos.py

to run the demo.
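
Before running the demo, it may help to confirm that the data files sit next to the scripts. The check below is a small sketch: the file list is taken from the names mentioned in the Paper Datasets section, and whether run_pos.py actually reads all six files is an assumption. The function name missing_data_files is hypothetical.

```python
from pathlib import Path

# File names as listed in the Paper Datasets section (assumed to be what the demo reads).
EXPECTED_FILES = [
    "BBCsports.npy",
    "BBCsports_grade.npy",
    "BBCsports_length.npy",
    "weight.npy",
    "index_tr.npy",
    "index_te.npy",
]


def missing_data_files(directory="."):
    """Return the expected .npy files that are not present in `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


# Example usage, run from the directory that contains run_pos.py:
#   print(missing_data_files() or "All data files found")
```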

Citation

If you find this repo useful for your research, please consider citing the paper:

@misc{jiang2021interpretable,
    title={Interpretable contrastive word mover's embedding},
    author={Ruijie Jiang and Julia Gouvea and Eric Miller and David Hammer and Shuchin Aeron},
    year={2021},
    eprint={2111.01023},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

If you have any questions, please feel free to contact Ruijie Jiang ([email protected]).
