A Comparative Review of Recent Kinect-Based Action Recognition Algorithms (TIP2020, Matlab codes)

Related tags

Deep LearningHDG
Overview

A Comparative Review of Recent Kinect-Based Action Recognition Algorithms

This repo contains:

  • the HDG implementation (Matlab codes) for 'Analysis and Evaluation of Kinect-based Action Recognition Algorithms', and
  • provides the links (google drive) for downloading the algorithms evaluated in our TIP journal and
  • provides direct links (google drive) to download 5 smaller datasets for action recognition research.

1 Introduction

This repository contains the implementation of HDG presented in the following paper:

[1] Lei Wang, 2017. Analysis and Evaluation of Kinect-based Action Recognition Algorithms. Master's thesis. School of Computer Science and Software Engineering, The University of Western Australia. [ArXiv] [BibTex]

[2] Lei Wang, Du Q. Huynh, and Piotr Koniusz. A Comparative Review of Recent Kinect-Based Action Recognition Algorithms. IEEE Transactions on Image Processing, 29: 15-28, 2020. [ArXiv] [BibTex]

We also provide the links for downloading the algorithms/datasets used in our TIP paper.

2 Other algorithms compared in TIP paper

You can download other algorithms we evaluated in TIP paper from the following links:

3 Datasets used in TIP paper

3.1 Five Smaller datasets

3.1.1 Depth+Skeleton

You can directly download the depth+skeleton sequences for the following smaller datasets here:

The above 5 downloaded datasets contain depth + skeleton data, which you can directly use for HDG algorithm in this repo:

  • unzip a dataset, and
  • put the Dataset folder into HDG folder, then
  • extract the features (refer to following sections for more details).

3.1.2 Depth video only

For downloading the UWA3DActivity+UWA3D Multiview Activity II depth only, you can use this link(extraction code: 172h).

For downloading the CAD-60 depth only, please use this link (extraction code: 36wt)

3.2 Big datasets (NTU RGB+D)

For big datasets such as NTU-60 and NTU-120, please refer to this link for the request to download.

4 Run the codes of HDG

This is an implementation based on Rahmani et al.’s paper ‘Real Time Action Recognition Using Histograms of Depth Gradients and Random Decision Forests’ (WACV2014).

To run our new HDG algorithm (which is analysed and compared in our TIP2020 paper):

4.0 A glance of skeleton configuration

To know more detailed information about the skeleton configuration/graph, please refer to the pdf file attached in this repo.

UWAS denotes the skeleton configuration for UWA3D Activity, and UWAW is for UWA3D Multiview Activity II.

4.1 Data preparation

  • Go to the 'Dataset' folder, then go to the 'depth' folder and copy all depth sequence in this folder (should be .mat format and the internal data has the same name 'inDepthVideo').

  • After that go to the 'skeleton' folder, copy all skeleton sequence (the skeleton sequence should also be .mat format and each skeleton sequence has the following dimension: #jointsx3x#frames, here 3 represents x, y and d respectively), the internal data has the same name 'skeletonsequence'.

4.2 Feature extraction and concatenation

  • Go to the 'MATLAB_Codes' folder, run each 'main' in each algorithm folder(in the order of 00, 01, 02 and 03), and then run 'main' in 'feature_concatenating'. You can also run '02' and '03' first and then run '00' and '01', since '00' may need more time for segmenting the foreground (around 6 hours) and '01' is based on the results of '00'.

  • For UWAMultiview dataset, remember to change the video sequence from uint16 to double using im2double before running each main in 00 and 01: in both 00 and 01 folders, in main function line 33 & 17, change depthsequence=actionvolume; to depthsequence=im2double(actionvolume);.

  • For feature concatenating, you can select different combinations of features for classification. There are four features, which are:

    • hod(histogram of depth),
    • hodg(histogram of depth gradients),
    • jmv(joint movement volume features) and
    • jpd(joint position differences features).
  • Remember to change the number of joints and the torso joint ID in the 'main' of '02' and '03' since different datasets have different number of joints and torso joint IDs (refer to the pdf attached in this repo for the skeleton configuration).

    • MSRPairs (3D Action Pairs): 20 joints, torso joint ID is '2';
    • MSRAction3D: 20 joints, torso joint ID is '4';
    • CAD-60: 15 joints, torso joint ID is '3';
    • UWA3D single view dataset (UWA3D Activity): 15 joints, torso joint ID is '9';
    • UWA3D multi view dataset (UWA3D Multiview Activity II): 15 joints, torso joint ID is '3';

4.3 Classification

  • Run 'main' of random decision forests (Lei uses different 'main' for different datasets since different datasets should have different training and testing datasets). In Lei's implementation, half of data are used for training and the remaining half for testing.

    • MSRPairs (3D Action Pairs): msrpairsmain.m
    • MSRAction3D: msr3dmain.m
    • CAD-60: cadmain.m
    • UWA3D single view (UWA3D Activity): uwasinglemain.m
    • UWA3D multi view (UWA3D Multiview Activity II): uwamultimain.m

4.4 Visualization (i.e., confusion matrix)

  • The results of the confusion matrix will be saved in the 'Results' folder, and the confusion matrix will be displayed. Moreover, the total accuracy will appear in the workspace of the MATLAB.

4.4.1 Save figures to pdf format

  • saveTightFigure function is downloaded from online resource, which can be used to save the confusion matrix plot as pdf files. The use of this function is, for example: saveTightFigure(gcf, 'uwamultiview.pdf');

Codes for parameters evaluation, and running over all possible combinations of selecting half subjects (for training) are not provided in this repo.

For more information, please refer to my research report and our journal paper, or contact me.

5 Citations

You can cite the following papers for the use of this work:

@mastersthesis{lei_thesis_2017,
  author       = {Lei Wang}, 
  title        = {Analysis and Evaluation of {K}inect-based Action Recognition Algorithms},
  school       = {School of the Computer Science and Software Engineering, The University of Western Australia},
  year         = 2017,
  month        = {Nov}
}
@article{lei_tip_2019,
author={Lei Wang and Du Q. Huynh and Piotr Koniusz},
journal={IEEE Transactions on Image Processing},
title={A Comparative Review of Recent Kinect-Based Action Recognition Algorithms},
year={2020},
volume={29},
number={},
pages={15-28},
doi={10.1109/TIP.2019.2925285},
ISSN={1941-0042},
month={},}

Acknowledgments

I am grateful to Associate Professor Du Huynh for her valuable suggestions and discussions. We would like to thank the authors of HON4D, HOPC, LARP-SO, HPM+TM, IndRNN and ST-GCN for making their codes publicly available. We thank the ROSE Lab of Nanyang Technological University(NTU), Singapore, for making the NTU RGB+D dataset freely accessible.

Owner
Lei Wang
PhD student, Machine Learning/Computer Vision Researcher
Lei Wang
(to be released) [NeurIPS'21] Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs

Higher-Order Transformers Kim J, Oh S, Hong S, Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs, NeurIPS 2021. [arxiv] W

Jinwoo Kim 44 Dec 28, 2022
Complex Answer Generation For Conversational Search Systems.

Complex Answer Generation For Conversational Search Systems. Code for Does Structure Matter? Leveraging Data-to-Text Generation for Answering Complex

Hanane Djeddal 0 Dec 06, 2021
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

🦩 Flamingo - Pytorch Implementation of Flamingo, state-of-the-art few-shot visual question answering attention net, in Pytorch. It will include the p

Phil Wang 630 Dec 28, 2022
Personals scripts using ageitgey/face_recognition

HOW TO USE pip3 install requirements.txt Add some pictures of known people in the folder 'people' : a) Create a folder called by the name of the perso

Antoine Bollengier 1 Jan 06, 2022
TensorFlow implementation of ENet

TensorFlow-ENet TensorFlow implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. This model was tested on th

Kwotsin 255 Oct 17, 2022
Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

Unseen Object Clustering: Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation Introduction In this work, we propose a new method

NVIDIA Research Projects 132 Dec 13, 2022
PyTorch wrapper for Taichi data-oriented class

Stannum PyTorch wrapper for Taichi data-oriented class PRs are welcomed, please see TODOs. Usage from stannum import Tin import torch data_oriented =

86 Dec 23, 2022
An efficient framework for reinforcement learning.

rl: An efficient framework for reinforcement learning Requirements Introduction PPO Test Requirements name version Python =3.7 numpy =1.19 torch =1

16 Nov 30, 2022
Controlling the MicriSpotAI robot from scratch

Abstract: The SpotMicroAI project is designed to be a low cost, easily built quadruped robot. The design is roughly based off of Boston Dynamics quadr

Florian Wilk 405 Jan 05, 2023
An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" in Pytorch.

GLOM An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" for MNIST Dataset. To understand this

50 Oct 19, 2022
[ECE NTUA] 👁 Computer Vision - Lab Projects & Theoretical Problem Sets (2020-2021)

Computer Vision - NTUA (2020-2021) This repository hosts the lab projects and theoretical problem sets of the Computer Vision course held by ECE NTUA

Dimitris Dimos 6 Jul 21, 2022
No Code AI/ML platform

NoCodeAIML No Code AI/ML platform - Community Edition Video credits: Uday Kiran Typical No Code AI/ML Platform will have features like drag and drop,

Bhagvan Kommadi 5 Jan 28, 2022
Code of the lileonardo team for the 2021 Emotion and Theme Recognition in Music task of MediaEval 2021

Emotion and Theme Recognition in Music The repository contains code for the submission of the lileonardo team to the 2021 Emotion and Theme Recognitio

Vincent Bour 8 Aug 02, 2022
Deep learning models for change detection of remote sensing images

Change Detection Models (Remote Sensing) Python library with Neural Networks for Change Detection based on PyTorch. ⚡ ⚡ ⚡ I am trying to build this pr

Kaiyu Li 176 Dec 24, 2022
FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation

This repository contains the code accompanying the paper " FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation" Paper link: R

20 Jun 29, 2022
I3-master-layout - Simple master and stack layout script

Simple master and stack layout script | ------ | ----- | | | | | Ma

Tobias S 18 Dec 05, 2022
This script scrapes and stores the availability of timeslots for Car Driving Test at all RTA Serivce NSW centres in the state.

This script scrapes and stores the availability of timeslots for Car Driving Test at all RTA Serivce NSW centres in the state. Dependencies Account wi

Balamurugan Soundararaj 21 Dec 14, 2022
Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices

Intro Real-time object detection and classification. Paper: version 1, version 2. Read more about YOLO (in darknet) and download weight files here. In

Trieu 6.1k Dec 30, 2022
TensorRT examples (Jetson, Python/C++)(object detection)

TensorRT examples (Jetson, Python/C++)(object detection)

Nobuo Tsukamoto 53 Dec 22, 2022
Simple and understandable swin-transformer OCR project

swin-transformer-ocr ocr with swin-transformer Overview Simple and understandable swin-transformer OCR project. The model in this repository heavily r

Ha YongWook 67 Dec 31, 2022