Kinetics-Data-Preprocessing

Overview

Kinetics-Data-Preprocessing

Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects like SlowFast or PytorchVideo. However, their instruction of dataset preparation is too brief. Therefore, this project provides a more detailed instruction for Kinetics-400/-600 data preprocessing.

Download the raw videos

There are multiple ways to download the raw videos of Kinetics-400 and Kinetics-600. Here, I list two common choices that I found to be simple and fast:

  1. Download the videos via the official scripts. However, I noticed that this option is very slow, so I personally recommend the next choice.

  2. Download the compressed videos from the Common Visual Data Foundation Servers following the repository, which is much faster as they organized 650,000 independent video clips into several compressed files.

Resize the videos

The common data preprocessing of Kinetics requires all videos to be resized to the short edge size of 256. Therefore, I use the moviepy package to do so. The package can be easily installed by the following command:

pip install moviepy

Then, you can use the resize_video.py to resize all the videos within the given folder by following command:

python resize_video.py --size 256 --path YOUR_VIDEO_CONTAINER

IMPORTANT! Note that the resize_video.py will replace the original mp4 files. If you want to keep the original files, please make copys before resizing.

Prepare the csv annotation files

Following SlowFast, we also need to prepare the csv annotation files for training, validation, and testing set as train.csv, val.csv, test.csv. The format of the csv file is:

path_to_video_1 label_1
path_to_video_2 label_2
path_to_video_3 label_3
...
path_to_video_N label_N

The original annotations can be found at the kinetics website, or you can directly use download links of kinetics-400 annotations and kinetics-600 annotations. The official annotations support two different types of files: csv and json. However, both of them don't meet the above format. Therefore, I also provide a python code to transfer json files to the corresponding csv files with correct format. It takes two inputs: the container path of all videos, the path of official json annotation files. The output annotations will be named as 'output_XXX.csv' and located at the same folder. The label-to-id mapping dictionary will be saved as 'label2id.json'. The following command is my example.

python kinetics_annotation.py --train_path /home/kaihua/datasets/kinetics-train/ \
    --test_path /home/kaihua/datasets/kinetics-test/ \
    --val_path /home/kaihua/datasets/kinetics-val/ \
    --anno_path /home/kaihua/datasets/kinetics400-anno/
Owner
Kaihua Tang
@kaihuatang.github.io/
Kaihua Tang
Recognize Handwritten Digits using Deep Learning on the browser itself.

MNIST on the Web An attempt to predict MNIST handwritten digits from my PyTorch model from the browser (client-side) and not from the server, with the

Harjyot Bagga 7 May 28, 2022
Magic tool for managing internet connection in local network by @zalexdev

Megacut ✂️ A new powerful Python3 tool for managing internet on a local network Installation git clone https://github.com/stryker-project/megacut cd m

Stryker 12 Dec 15, 2022
Official PyTorch implementation of StyleGAN3

Modified StyleGAN3 Repo Changes Made tied to python 3.7 syntax .jpgs instead of .pngs for training sample seeds to recreate the 1024 training grid wit

Derrick Schultz (he/him) 83 Dec 15, 2022
Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary Differential Equations

ODE GAN (Prototype) in PyTorch Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary

Somshubra Majumdar 15 Feb 10, 2022
A generalized framework for prototyping full-stack cooperative driving automation applications under CARLA+SUMO.

OpenCDA OpenCDA is a SIMULATION tool integrated with a prototype cooperative driving automation (CDA; see SAE J3216) pipeline as well as regular autom

UCLA Mobility Lab 726 Dec 29, 2022
Vector Quantized Diffusion Model for Text-to-Image Synthesis

Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov

Shuyang Gu 294 Jan 05, 2023
The FIRST GANs-based omics-to-omics translation framework

OmiTrans Please also have a look at our multi-omics multi-task DL freamwork 👀 : OmiEmbed The FIRST GANs-based omics-to-omics translation framework Xi

Xiaoyu Zhang 6 Dec 14, 2022
Codebase for the Summary Loop paper at ACL2020

Summary Loop This repository contains the code for ACL2020 paper: The Summary Loop: Learning to Write Abstractive Summaries Without Examples. Training

Canny Lab @ The University of California, Berkeley 44 Nov 04, 2022
Quantized tflite models for ailia TFLite Runtime

ailia-models-tflite Quantized tflite models for ailia TFLite Runtime About ailia TFLite Runtime ailia TF Lite Runtime is a TensorFlow Lite compatible

ax Inc. 13 Dec 23, 2022
Neural network for digit classification powered by cuda

cuda_nn_mnist Neural network library for digit classification powered by cuda Resources The library was built to work with MNIST dataset. python-mnist

Nikita Ardashev 1 Dec 20, 2021
Builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techiniques

This project builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techiniques.

20 Dec 30, 2022
Code for Parameter Prediction for Unseen Deep Architectures (NeurIPS 2021)

Parameter Prediction for Unseen Deep Architectures (NeurIPS 2021) authors: Boris Knyazev, Michal Drozdzal, Graham Taylor, Adriana Romero-Soriano Overv

Facebook Research 462 Jan 03, 2023
Variational autoencoder for anime face reconstruction

VAE animeface Variational autoencoder for anime face reconstruction Introduction This repository is an exploratory example to train a variational auto

Minzhe Zhang 2 Dec 11, 2021
Nightmare-Writeup - Writeup for the Nightmare CTF Challenge from 2022 DiceCTF

Nightmare: One Byte to ROP // Alternate Solution TLDR: One byte write, no leak.

1 Feb 17, 2022
Code for the AAAI-2022 paper: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification

Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification (AAAI 2022) Prerequisite PyTorch = 1.2.0 P

16 Dec 14, 2022
A new play-and-plug method of controlling an existing generative model with conditioning attributes and their compositions.

Viz-It Data Visualizer Web-Application If I ask you where most of the data wrangler looses their time ? It is Data Overview and EDA. Presenting "Viz-I

NVIDIA Research Projects 66 Jan 01, 2023
HW3 ― GAN, ACGAN and UDA

HW3 ― GAN, ACGAN and UDA In this assignment, you are given datasets of human face and digit images. You will need to implement the models of both GAN

grassking100 1 Dec 13, 2021
Codebase for Attentive Neural Hawkes Process (A-NHP) and Attentive Neural Datalog Through Time (A-NDTT)

Introduction Codebase for the paper Transformer Embeddings of Irregularly Spaced Events and Their Participants. This codebase contains two packages: a

Alan Yang 28 Dec 12, 2022
Learning to Stylize Novel Views

Learning to Stylize Novel Views [Project] [Paper] Contact: Hsin-Ping Huang ([ema

34 Nov 27, 2022