Activity image-based video retrieval

Last update: Oct 21, 2021

Overview

Cross-modal-retrieval

Our approach focuses on the Activity Image-to-Video Retrieval (AIVR) task. We compare it against state-of-the-art single modality hashing methods, multiple modalities hashing methods, and cross-modal retrieval methods.

Single modality hashing methods

Some hashing baselines for image retrieval can be found in https://github.com/willard-yuan/hashing-baseline-for-image-retrieval.
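As a rough illustration of what hashing-based retrieval baselines generally have in common, the sketch below binarizes real-valued features and ranks database items by Hamming distance to the query. It is not code from the linked repository: the random hyperplane projections are an assumption standing in for whatever hash function a given baseline actually learns, and all function names (`make_projections`, `hash_code`, `hamming`, `retrieve`) are hypothetical.

```python
import random


def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes.

    Assumption: these stand in for a learned hash function.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, projections):
    """Binarize a feature vector: one bit per hyperplane, packed into an int."""
    code = 0
    for plane in projections:
        dot = sum(v * w for v, w in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two packed binary codes."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database_codes, top_k=3):
    """Indices of the top_k database items closest to the query in Hamming distance."""
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return order[:top_k]


if __name__ == "__main__":
    # Toy 4-dimensional features; real baselines would use image or video features.
    planes = make_projections(dim=4, n_bits=16)
    database = [[1.0, 0.2, -0.5, 0.3], [-0.9, 0.1, 0.4, -0.7], [0.8, 0.3, -0.4, 0.2]]
    codes = [hash_code(v, planes) for v in database]
    query = hash_code([1.0, 0.2, -0.5, 0.3], planes)
    print(retrieve(query, codes, top_k=2))
```

Packing the bits into a single integer keeps the distance computation to one XOR and a bit count, which is the main reason hashing methods are attractive for large-scale retrieval.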

Multiple modalities hashing methods

For more details, refer to https://github.com/czxxjtu/Hash-Learning.github.io. Some details about the hashing methods are in the hashing-baseline-for-image-retrieval-master folder.

Cross-modal retrieval methods

The compared cross-modal retrieval methods follow the paper:

Datasets

THUMOS'14 Dataset:

https://pan.baidu.com/s/1H6c8nh_Hs7gVkhESpxtvAg (extraction code: qp26)

ActivityNet Dataset:

https://pan.baidu.com/s/1P0jRecEmplCPaTPwFoOpVQ (extraction code: pnw9)

Bibtex

When using images from our dataset, please cite our paper using the following BibTeX:

@article{pba2020,
author    = {Ruicong Xu and Li Niu and Jianfu Zhang and Liqing Zhang},
title     = {A Proposal-based Approach for Activity Image-to-Video Retrieval},
journal   = {AAAI},
year      = {2020}}

Owner

BCMI
