Automatic Video Captioning Evaluation Metric --- EMScore

Last update: Nov 28, 2022

Related tags

Deep Learning emscore

Overview

Automatic Video Captioning Evaluation Metric --- EMScore

Overview

For an illustration, EMScore can be computed as:

Installation

modify the encode_text() function in CLIP/clip/model.py as follows:

def encode_text(self, text, local=False):
    x = self.token_embedding(text).type(self.dtype)  # [batch_size, n_ctx, d_model]

    x = x + self.positional_embedding.type(self.dtype)
    x = x.permute(1, 0, 2)  # NLD -> LND
    x = self.transformer(x)
    x = x.permute(1, 0, 2)  # LND -> NLD
    x = self.ln_final(x).type(self.dtype)

    if local:
        x = x @ self.text_projection
    else:
        # x.shape = [batch_size, n_ctx, transformer.width]
        # take features from the eot embedding (eot_token is the highest number in each sequence)
        x = x[torch.arange(x.shape[0]), text.argmax(dim=-1)] @ self.text_projection
  
    return x

Push your modified CLIP to your GitHub.

Install

$ conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
$ pip install ftfy regex tqdm
$ pip install git+https://github.com/$Yours_GitHub_name/CLIP

Replace cudatoolkit=11.0 above with the appropriate CUDA version on your machine or cpuonly when installing on a machine without a GPU.

Usage:

A general demo

python demo.py

VATEX-EVAL

download the files in the following link, and save at a storage directory

https://drive.google.com/drive/folders/1jAfZZKEgkMEYFF2x1mhYo39nH-TNeGm6?usp=sharing

run code

python VATEX-EVAL-demo.py --storage_path $storage_path --use_n_refs 1 --use_feat_cache --use_idf

ActivityNet-FOIL

download the files in the following link, and save at a storage directory

https://drive.google.com/drive/folders/1oY9EJiEi_db_1GH-R33JDqfE8txffKR3?usp=sharing

run code

python ActivityNet-FOIL_demo.py --storage_path $storage_path --use_references --use_idf

Others

if you want extract embeddings by yourself:

python extract_video_embeddings.py --videos_path $your_video_path  --save_path $your_storage_path --backbone 'ViT-B/32'

Automatic Video Captioning Evaluation Metric --- EMScore

Related tags

Overview

Overview

Installation

Usage:

A general demo

VATEX-EVAL

ActivityNet-FOIL

Others

Owner

Yaya Shi

Safe Bayesian Optimization

Quantized models with python

This is a simple backtesting framework to help you test your crypto currency trading. It includes a way to download and store historical crypto data and to execute a trading strategy.

A Multi-attribute Controllable Generative Model for Histopathology Image Synthesis

Attention over nodes in Graph Neural Networks using PyTorch (NeurIPS 2019)

A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion

A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash

Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch .

CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer

Event-forecasting - Event Forecasting Algorithms With Python

FAVD: Featherweight Assisted Vulnerability Discovery

OMLT: Optimization and Machine Learning Toolkit

Recurrent Variational Autoencoder that generates sequential data implemented with pytorch

A platform for intelligent agent learning based on a 3D open-world FPS game developed by Inspir.AI.

TGS Salt Identification Challenge

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation

Investigating Attention Mechanism in 3D Point Cloud Object Detection (arXiv 2021)

Pytorch implementation of CVPR2021 paper "MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generation"

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [2021]

Python interface for SmartRF Sniffer 2 Firmware