Multi-query Video Retrieval

This repository contains the code for the paper:

@misc{wang2022multiquery,
      title={Multi-query Video Retrieval}, 
      author={Zeyu Wang and Yu Wu and Karthik Narasimhan and Olga Russakovsky},
      year={2022},
      eprint={2201.03639},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Data Preparation

Download the raw videos for MSR-VTT, MSVD and VATEX, and put them in the data/{dataset}/raw_videos folder.
Run the script data/extract_frames.sh to extract frames from raw videos.
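
The contents of data/extract_frames.sh are not reproduced in this README, so the following is only a rough Python sketch of what such a step might look like. It assumes ffmpeg is installed and that frames are written as 0.jpg, 1.jpg, ... into extracted_frames/{video file name}/, matching the folder layout shown in this section. The helper names (frame_dir_for, ffmpeg_command, extract_all) are hypothetical and not part of the repository; the actual script may sample frames at a different rate or pass other options.

```python
from pathlib import Path
from typing import List
import subprocess


def frame_dir_for(video: Path, dataset_dir: Path) -> Path:
    """Return extracted_frames/<video file name>/ for a raw video."""
    return dataset_dir / "extracted_frames" / video.name


def ffmpeg_command(video: Path, out_dir: Path) -> List[str]:
    """Build an ffmpeg call that writes frames as 0.jpg, 1.jpg, ..."""
    return [
        "ffmpeg", "-i", str(video),
        "-start_number", "0",        # assumption: frame numbering starts at 0
        str(out_dir / "%d.jpg"),
    ]


def extract_all(dataset_dir: Path, dry_run: bool = True) -> List[List[str]]:
    """Collect (and optionally run) one ffmpeg command per raw video."""
    commands = []
    raw_dir = dataset_dir / "raw_videos"
    for video in sorted(p for p in raw_dir.iterdir() if p.is_file()):
        out_dir = frame_dir_for(video, dataset_dir)
        cmd = ffmpeg_command(video, out_dir)
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands
```

Calling extract_all(Path("data/msrvtt")) with the default dry_run=True only returns the commands, which makes it easy to inspect them before anything is written to disk.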

The resulting data folder is structured like this:

├── data
    ├── msrvtt
        ├── msrvtt_train.json
        ├── msrvtt_test.json
        ├── msrvtt_test_varying_query_sample_1-20.json
        ├── raw_videos
            ├── video0.mp4
            ├── ...
        ├── extracted_frames
            ├── video0.mp4
                ├── 0.jpg
                ├── ...
            ├── ...
    ├── msvd
        ├── ...
    ├── vatex
        ├── ...
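
Before training, it can help to confirm that every raw video has a matching frame folder. The check below is a minimal sketch and not part of the repository: missing_frame_dirs is a hypothetical helper, and it assumes that each dataset folder follows the msrvtt layout above (raw_videos/ next to extracted_frames/, with one sub-folder of .jpg frames per video file name).

```python
from pathlib import Path
from typing import List


def missing_frame_dirs(dataset_dir: Path) -> List[str]:
    """List raw videos that have no extracted .jpg frames yet."""
    raw_dir = dataset_dir / "raw_videos"
    frames_root = dataset_dir / "extracted_frames"
    missing = []
    for video in sorted(p for p in raw_dir.iterdir() if p.is_file()):
        # Frames are expected in a folder named after the video file itself.
        frame_dir = frames_root / video.name
        if not frame_dir.is_dir() or not any(frame_dir.glob("*.jpg")):
            missing.append(video.name)
    return missing
```

An empty list from missing_frame_dirs(Path("data/msrvtt")) suggests that frame extraction covered every file in raw_videos; it does not verify that the number of frames per video is correct.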

For the Frozen model, download the pretrained checkpoint provided by the original authors here, and put it in the record/pretrained folder.

Training

Run command: python train.py -c configs/{config_path}

Evaluation

Run command: python evaluate.py -c configs/{config_path}
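
The two commands above can also be chained from Python. This is a small sketch under stated assumptions, not a script shipped with the repository: build_command and train_then_evaluate are hypothetical names, example.json is a placeholder config file name, and whether a single config file is suitable for both training and evaluation is an assumption.

```python
import subprocess
import sys
from typing import List


def build_command(script: str, config_path: str) -> List[str]:
    """Build `python <script> -c configs/<config_path>` as an argument list."""
    return [sys.executable, script, "-c", f"configs/{config_path}"]


def train_then_evaluate(config_path: str, dry_run: bool = True) -> List[List[str]]:
    """Return the train and evaluate commands; run them in order if not dry_run."""
    commands = [
        build_command("train.py", config_path),
        build_command("evaluate.py", config_path),
    ]
    if not dry_run:
        for cmd in commands:
            # check=True stops the chain if training fails.
            subprocess.run(cmd, check=True)
    return commands
```

For instance, train_then_evaluate("example.json") returns both argument lists without launching anything; pass dry_run=False from the repository root to actually run them.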

Acknowledgements

The structure of this repository is based on https://github.com/victoresque/pytorch-template. Some of the code is adapted from https://github.com/m-bain/frozen-in-time and https://github.com/ArrowLuo/CLIP4Clip.

Owner

Princeton Visual AI Lab
