VLG-Net: Video-Language Graph Matching Networks for Video Grounding

Last update: Dec 04, 2022

Introduction

Official repository for VLG-Net: Video-Language Graph Matching Networks for Video Grounding. [ArXiv Preprint]

The paper was accepted to the first edition of the ICCV workshop: AI for Creative Video Editing and Understanding (CVEU).

Installation

Clone the repository and move into its folder:

git clone https://github.com/Soldelli/VLG-Net.git
cd VLG-Net

Install the environment:

conda env create -f environment.yml

If the installation fails, follow the instructions in doc/environment.md.

Data

Download the following resources and extract each one into the destination folder listed in the table.

Resource	Download Link	File Size	Destination Folder
StandfordCoreNLP-4.0.0	link	(~0.5GB)	`./datasets/`
TACoS	link	(~0.5GB)	`./datasets/`
ActivityNet-Captions	link	(~29GB)	`./datasets/`
DiDeMo	link	(~13GB)	`./datasets/`
GCNeXt warmup	link	(~0.1GB)	`./datasets/`
Pretrained Models	link	(~0.1GB)	`./models/`
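
Before downloading, it can help to confirm that enough disk space is available. The snippet below is a minimal sketch, not part of the repository: it sums the approximate sizes from the table and compares them with the free space reported by Python's standard `shutil.disk_usage`. The names `DOWNLOAD_SIZES_GB`, `total_download_gb`, and `free_gb` are hypothetical, and the extracted data may need more room than the archive sizes listed.

```python
import shutil

# Approximate sizes in GB, copied from the table above.
DOWNLOAD_SIZES_GB = {
    "StandfordCoreNLP-4.0.0": 0.5,
    "TACoS": 0.5,
    "ActivityNet-Captions": 29.0,
    "DiDeMo": 13.0,
    "GCNeXt warmup": 0.1,
    "Pretrained Models": 0.1,
}


def total_download_gb(sizes=DOWNLOAD_SIZES_GB):
    """Sum of the listed sizes, rounded to one decimal."""
    return round(sum(sizes.values()), 1)


def free_gb(path="."):
    """Free space, in GB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    needed = total_download_gb()
    available = free_gb(".")
    print(f"listed downloads: ~{needed} GB, free space: {available:.1f} GB")
    if available < needed:
        print("Warning: not enough free space for all listed resources.")
```

If only one dataset is needed, pass a smaller dictionary to `total_download_gb` to estimate that subset.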

The folder structure should be as follows:

.
├── configs
│
├── datasets
│   ├── activitynet1.3
│   │    ├── annotations
│   │    └── features
│   ├── didemo
│   │    ├── annotations
│   │    └── features
│   ├── tacos
│   │    ├── annotations
│   │    └── features
│   ├── gcnext_warmup
│   └── standford-corenlp-4.0.0
│
├── doc
│
├── lib
│   ├── config
│   ├── data
│   ├── engine
│   ├── modeling
│   ├── structures
│   └── utils
│
├── models
│   ├── activitynet
│   └── tacos
│
├── outputs
│
└── scripts
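
The layout above can be checked with a short script before training or evaluation. This is a minimal sketch under stated assumptions, not a tool shipped with the repository: it lists only the directories that the downloads are expected to populate, and the names `EXPECTED_DIRS` and `missing_dirs` are hypothetical.

```python
from pathlib import Path

# Directories taken from the tree above (only those filled by the downloads).
EXPECTED_DIRS = [
    "datasets/activitynet1.3/annotations",
    "datasets/activitynet1.3/features",
    "datasets/didemo/annotations",
    "datasets/didemo/features",
    "datasets/tacos/annotations",
    "datasets/tacos/features",
    "datasets/gcnext_warmup",
    "datasets/standford-corenlp-4.0.0",
    "models/activitynet",
    "models/tacos",
]


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("All expected directories are present.")
```

Run it from the repository root. An empty result means every listed folder exists; it does not verify the files inside them.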

Training

Copy and paste the following commands into a terminal.

Load environment:

conda activate vlg

For ActivityNet-Captions dataset, run:

python train_net.py --config-file configs/activitynet.yml OUTPUT_DIR outputs/activitynet

For TACoS dataset, run:

python train_net.py --config-file configs/tacos.yml OUTPUT_DIR outputs/tacos
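
To run both training jobs back to back, the two commands above can be wrapped in a small launcher. The following is a sketch only: `build_train_command` and `train_all` are hypothetical helpers, and the script assumes it is started from the repository root with the `vlg` environment already activated, since it reuses the current Python interpreter. It covers only the two datasets for which a training command is given above.

```python
import subprocess
import sys

# Datasets with a training command listed in this README.
DATASETS = ("activitynet", "tacos")


def build_train_command(dataset):
    """Build the train_net.py invocation shown above for one dataset."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        sys.executable, "train_net.py",
        "--config-file", f"configs/{dataset}.yml",
        "OUTPUT_DIR", f"outputs/{dataset}",
    ]


def train_all(datasets=DATASETS):
    """Run the training jobs in sequence, stopping at the first failure."""
    for name in datasets:
        subprocess.run(build_train_command(name), check=True)


if __name__ == "__main__":
    # Only print the commands; call train_all() to actually launch them.
    for name in DATASETS:
        print(" ".join(build_train_command(name)))
```

Passing `check=True` makes `subprocess.run` raise an error if a job exits with a non-zero status, so a failed ActivityNet run does not silently continue into TACoS.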

Evaluation

For simplicity, we provide scripts that automatically run inference with the pretrained models. To run inference with a different model, see the details inside each script.

Load environment:

conda activate vlg

Then run one of the following scripts to launch the evaluation.

For ActivityNet-Captions dataset, run:

    bash scripts/activitynet.sh

For TACoS dataset, run:

    bash scripts/tacos.sh

Expected results:

After we cleaned the code and fixed a couple of minor bugs, performance changed slightly with respect to the numbers reported in the paper. See the tables below.

ActivityNet	Rank1@0.5	Rank1@0.7	Rank5@0.5	Rank5@0.7
Paper	46.32	29.82	77.15	63.33
Current	46.32	29.79	77.19	63.36

TACoS	Rank1@0.1	Rank1@0.3	Rank1@0.5	Rank5@0.1	Rank5@0.3	Rank5@0.5
Paper	57.21	45.46	34.19	81.80	70.38	56.56
Current	57.16	45.56	34.14	81.48	70.13	56.34
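
To see how small the changes are, the per-column differences can be computed directly from the two tables. The snippet below is a minimal sketch: the values are copied from the rows above in column order, and `RESULTS`, `deltas`, and `max_abs_delta` are hypothetical names used only for illustration.

```python
# Values copied from the two tables above, in column order.
RESULTS = {
    "ActivityNet": {
        "paper":   [46.32, 29.82, 77.15, 63.33],
        "current": [46.32, 29.79, 77.19, 63.36],
    },
    "TACoS": {
        "paper":   [57.21, 45.46, 34.19, 81.80, 70.38, 56.56],
        "current": [57.16, 45.56, 34.14, 81.48, 70.13, 56.34],
    },
}


def deltas(paper, current):
    """Per-column difference (current minus paper), rounded to two decimals."""
    return [round(c - p, 2) for p, c in zip(paper, current)]


def max_abs_delta(results=RESULTS):
    """Largest absolute change across all datasets and columns."""
    return max(
        abs(d)
        for rows in results.values()
        for d in deltas(rows["paper"], rows["current"])
    )


if __name__ == "__main__":
    for name, rows in RESULTS.items():
        print(name, deltas(rows["paper"], rows["current"]))
    print("largest absolute change:", max_abs_delta())
```

With these numbers, the largest change is 0.32 points, in the fourth TACoS column; every ActivityNet column moves by at most 0.04 points.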

Citation

If any part of our paper and code is helpful to your work, please cite with:

@inproceedings{soldan2021vlg,
  title={VLG-Net: Video-Language Graph Matching Network for Video Grounding},
  author={Soldan, Mattia and Xu, Mengmeng and Qu, Sisi and Tegner, Jesper and Ghanem, Bernard},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={3224--3234},
  year={2021}
}

Owner

Mattia Soldan
