Video Corpus Moment Retrieval with Contrastive Learning (SIGIR 2021)

Overview

Video Corpus Moment Retrieval with Contrastive Learning

PyTorch implementation for the paper "Video Corpus Moment Retrieval with Contrastive Learning" (SIGIR 2021, long paper): SIGIR version, ArXiv version.

model_overview

The codes are modified from TVRetrieval.

Prerequisites

  • python 3.x with pytorch (1.7.0), torchvision, transformers, tensorboard, tqdm, h5py, easydict
  • cuda, cudnn

If you have Anaconda installed, the conda environment of ReLoCLNet can be built as follows (take python 3.7 as an example):

conda create --name reloclnet python=3.7
conda activate reloclnet
conda install -c anaconda cudatoolkit cudnn  # ignore this if you already have cuda installed
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch
conda install -c anaconda h5py=2.9.0
conda install -c conda-forge transformers tensorboard tqdm easydict

The conda environment of TVRetrieval also works.

Getting started

  1. Clone this repository
$ git clone [email protected]:IsaacChanghau/ReLoCLNet.git
$ cd ReLoCLNet
  1. Download features

For the features of TVR dataset, please download tvr_feature_release.tar.gz (link is copied from TVRetrieval#prerequisites) and extract it to the data directory:

$ tar -xf path/to/tvr_feature_release.tar.gz -C data

This link may be useful for you to directly download Google Drive files using wget. Please refer TVRetrieval#prerequisites for more details about how the features are extracted if you are interested.

  1. Add project root to PYTHONPATH (Note that you need to do this each time you start a new session.)
$ source setup.sh

Training and Inference

TVR dataset

# train, refer `method_tvr/scripts/train.sh` and `method_tvr/config.py` more details about hyper-parameters
$ bash method_tvr/scripts/train.sh tvr video_sub_tef resnet_i3d --exp_id reloclnet
# inference
# the model directory placed in method_tvr/results/tvr-video_sub_tef-reloclnet-*
# change the MODEL_DIR_NAME as tvr-video_sub_tef-reloclnet-*
# SPLIT_NAME: [val | test]
$ bash method_tvr/scripts/inference.sh MODEL_DIR_NAME SPLIT_NAME

For more details about evaluation and submission, please refer TVRetrieval#training-and-inference.

Citation

If you feel this project helpful to your research, please cite our work.

@inproceedings{zhang2021video,
	author = {Zhang, Hao and Sun, Aixin and Jing, Wei and Nan, Guoshun and Zhen, Liangli and Zhou, Joey Tianyi and Goh, Rick Siow Mong},
	title = {Video Corpus Moment Retrieval with Contrastive Learning},
	year = {2021},
	isbn = {9781450380379},
	publisher = {Association for Computing Machinery},
	address = {New York, NY, USA},
	url = {https://doi.org/10.1145/3404835.3462874},
	doi = {10.1145/3404835.3462874},
	booktitle = {Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval},
	pages = {685–695},
	numpages = {11},
	location = {Virtual Event, Canada},
	series = {SIGIR '21}
}

TODO

  • Upload codes for ActivityNet Captions dataset
Owner
ZHANG HAO
Research engineer at A*STAR and Ph.D. (CS) candidates at NTU
ZHANG HAO
[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets

[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets Introduction This repo contains the source code accompanying the paper: Well-tuned Sim

52 Jan 04, 2023
Contrastive Learning for Compact Single Image Dehazing, CVPR2021

AECR-Net Contrastive Learning for Compact Single Image Dehazing, CVPR2021. Official Pytorch based implementation. Paper arxiv Pytorch Version TODO: mo

glassy 253 Jan 01, 2023
Nightmare-Writeup - Writeup for the Nightmare CTF Challenge from 2022 DiceCTF

Nightmare: One Byte to ROP // Alternate Solution TLDR: One byte write, no leak.

1 Feb 17, 2022
Python/Rust implementations and notes from Proofs Arguments and Zero Knowledge

What is this? This is where I'll be collecting resources related to the Study Group on Dr. Justin Thaler's Proofs Arguments And Zero Knowledge Book. T

Thor 66 Jan 04, 2023
Contrastive Feature Loss for Image Prediction

Contrastive Feature Loss for Image Prediction We provide a PyTorch implementation of our contrastive feature loss presented in: Contrastive Feature Lo

Alex Andonian 44 Oct 05, 2022
Y. Zhang, Q. Yao, W. Dai, L. Chen. AutoSF: Searching Scoring Functions for Knowledge Graph Embedding. IEEE International Conference on Data Engineering (ICDE). 2020

AutoSF The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding" and this paper has been accepted by ICDE2020. News:

AutoML Research 64 Dec 17, 2022
Multi-Horizon-Forecasting-for-Limit-Order-Books

Multi-Horizon-Forecasting-for-Limit-Order-Books This jupyter notebook is used to demonstrate our work, Multi-Horizon Forecasting for Limit Order Books

Zihao Zhang 116 Dec 23, 2022
PyTorch Lightning implementation of Automatic Speech Recognition

lasr Lightening Automatic Speech Recognition An MIT License ASR research library, built on PyTorch-Lightning, for developing end-to-end ASR models. In

Soohwan Kim 40 Sep 19, 2022
[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links

LinkBERT: A Knowledgeable Language Model Pretrained with Document Links This repo provides the model, code & data of our paper: LinkBERT: Pretraining

Michihiro Yasunaga 264 Jan 01, 2023
Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP.

Hire-Wave-MLP.pytorch Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP Resul

Nevermore 29 Oct 28, 2022
A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.

Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark A Pytorch Implementation of Detectron Example output of e2e_mask_rcnn-R-101-F

Roy 2.8k Dec 29, 2022
Trading Gym is an open source project for the development of reinforcement learning algorithms in the context of trading.

Trading Gym Trading Gym is an open-source project for the development of reinforcement learning algorithms in the context of trading. It is currently

Dimitry Foures 535 Nov 15, 2022
Deep-learning-roadmap - All You Need to Know About Deep Learning - A kick-starter

Deep Learning - All You Need to Know Sponsorship To support maintaining and upgrading this project, please kindly consider Sponsoring the project deve

Instill AI 4.4k Dec 26, 2022
Deep motion transfer

animation-with-keypoint-mask Paper The right most square is the final result. Softmax mask (circles): \ Heatmap mask: \ conda env create -f environmen

9 Nov 01, 2022
Code for IntraQ, PyTorch implementation of our paper under review

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization paper Requirements Python = 3.7.10 Pytorch == 1.7

1 Nov 19, 2021
Code for Referring Image Segmentation via Cross-Modal Progressive Comprehension, CVPR2020.

CMPC-Refseg Code of our CVPR 2020 paper Referring Image Segmentation via Cross-Modal Progressive Comprehension. Shaofei Huang*, Tianrui Hui*, Si Liu,

spyflying 55 Dec 01, 2022
A PyTorch Image-Classification With AlexNet And ResNet50.

PyTorch 图像分类 依赖库的下载与安装 在终端中执行 pip install -r -requirements.txt 完成项目依赖库的安装 使用方式 数据集的准备 STL10 数据集 下载:STL-10 Dataset 存储位置:将下载后的数据集中 train_X.bin,train_y.b

FYH 4 Feb 22, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

Vowpal Wabbit 8.1k Jan 06, 2023
Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.

Spchcat Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi. Description spchcat is a command-line tool that read

Pete Warden 279 Jan 03, 2023