Video-Captioning - A machine learning project that generates captions for video frames, describing the relationships between the objects in the video

Last update: Jan 23, 2022

Overview

Video-Captioning

A machine learning project that generates captions for video frames, describing the relationships between the objects in the video.

Approach

Our framework uses a sequence-to-sequence model to predict visual relationships in video: the input is a sequence of video frames, and the output is a relation triplet < object1 − relationship − object2 > describing the video. This extends the sequence-to-sequence modelling approach to inputs that are sequences of video frames.

Figure: Bidirectional LSTM layer (coloured red) encodes visual feature inputs, and the LSTM layer (coloured green) decodes the features into a sequence of words.
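
The encoder-decoder layout in the figure can be sketched in Keras as follows. This is a minimal illustration, not the project's actual model: the function name build_seq2seq, the layer sizes, the use of pre-extracted per-frame feature vectors, and the single shared label vocabulary for objects and relationships are all assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_seq2seq(num_frames=30, feature_dim=2048, hidden=256, vocab_size=100):
    """Bidirectional LSTM encoder over frame features, LSTM decoder over 3 output steps.

    All sizes are illustrative assumptions, not values taken from the project.
    """
    # One pre-extracted feature vector per video frame (assumed input format).
    frames = keras.Input(shape=(num_frames, feature_dim), name="frame_features")
    # Encoder: reads the frames in both directions and summarizes them in one vector.
    encoded = layers.Bidirectional(layers.LSTM(hidden), name="encoder")(frames)
    # Feed the summary to the decoder once per output token:
    # object1, relationship, object2.
    repeated = layers.RepeatVector(3)(encoded)
    decoded = layers.LSTM(hidden, return_sequences=True, name="decoder")(repeated)
    # Probability distribution over an assumed shared label vocabulary at each step.
    probs = layers.TimeDistributed(
        layers.Dense(vocab_size, activation="softmax"), name="token_probs"
    )(decoded)
    model = keras.Model(frames, probs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model


model = build_seq2seq()
model.summary()
```

Repeating the encoder summary three times is one simple way to obtain a fixed-length output of three tokens; a decoder that consumes its own previous prediction at each step would be an equally valid choice.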

Results

Python Dependencies

Pandas
Keras
TensorFlow
NumPy
albumentations
Pillow

Procedure

Training

To train the model, run the script train.py:

  python train.py

To train on your own dataset, save your data in a directory (see the data folder for the expected format) and update the following JSON files:

object1_object2.json: A dictionary of objects, with object labels as keys and ids as values.
relationship.json: A dictionary of relationships, with relationship labels as keys and ids as values.
training_annotations.json: A dictionary of training videos, with video ids as keys and a list of annotations as values.
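
As a rough illustration of the three JSON files described above, the sketch below writes small example files and decodes one annotation back into labels. The labels, ids, and video ids are hypothetical, and the layout of each annotation list (object1 id, relationship id, object2 id) is an assumption; check the data folder for the actual format.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical labels and ids, for illustration only.
objects = {"person": 0, "dog": 1, "ball": 2}
relationships = {"chases": 0, "holds": 1}
# Assumed value layout: one [object1 id, relationship id, object2 id] list per video.
training_annotations = {"video_0001": [0, 1, 2], "video_0002": [1, 0, 2]}

# Write the three files into a temporary stand-in for the data directory.
data_dir = Path(tempfile.mkdtemp())
files = {
    "object1_object2.json": objects,
    "relationship.json": relationships,
    "training_annotations.json": training_annotations,
}
for name, content in files.items():
    (data_dir / name).write_text(json.dumps(content, indent=2))


def load_json(name):
    """Read one JSON file from the data directory."""
    return json.loads((data_dir / name).read_text())


# Invert the label -> id dictionaries so ids can be turned back into labels.
id_to_object = {i: label for label, i in load_json("object1_object2.json").items()}
id_to_relationship = {i: label for label, i in load_json("relationship.json").items()}

obj1, rel, obj2 = load_json("training_annotations.json")["video_0001"]
triplet = (id_to_object[obj1], id_to_relationship[rel], id_to_object[obj2])
print(triplet)  # ('person', 'holds', 'ball')
```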

When running the script, pass the path to your data directory:

  python train.py --train_data

Testing

To test the model or make predictions on your own dataset, run the script eval.py:

  python eval.py --test_data

The results are saved to the CSV file 'test_data_predictions.csv'.
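
The predictions file can be inspected with Python's csv module without knowing its columns in advance. The sketch below first writes a small stand-in file so that it runs on its own; the column names and the sample row are hypothetical and may not match what eval.py actually writes.

```python
import csv
import tempfile
from pathlib import Path

# Stand-in for 'test_data_predictions.csv'; the column names are hypothetical.
csv_path = Path(tempfile.mkdtemp()) / "test_data_predictions.csv"
with csv_path.open("w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["video_id", "object1", "relationship", "object2"])
    writer.writerow(["video_0001", "person", "holds", "ball"])

# Read the file back without hard-coding the columns.
with csv_path.open(newline="") as f:
    reader = csv.DictReader(f)
    columns = reader.fieldnames
    rows = list(reader)

print(columns)
print(len(rows), "prediction rows")
```

To inspect real output, point csv_path at the 'test_data_predictions.csv' file produced by eval.py and skip the writing step.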
