MoCoGAN: Decomposing Motion and Content for Video Generation

Last update: Dec 18, 2022

Overview

This repository contains an implementation and further details of MoCoGAN: Decomposing Motion and Content for Video Generation by Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz.

CVPR Poster:

Representation

MoCoGAN is a generative model that produces videos from random inputs. It represents motion and content separately, which offers control over what is generated. For example, MoCoGAN can generate the same object performing different actions, as well as the same action performed by different objects.
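The following is a minimal sketch of how separate content and motion codes could give this kind of control. It is not the repository's code. The function names, the code dimensions, and the toy `tanh` recurrence (a stand-in for a learned recurrent network) are assumptions made for illustration, and no generator network is included: the sketch only builds the per-frame latent vectors that such a generator would consume.

```python
import math
import random


def sample_content_code(dim, rng):
    """Draw one content code; it stays fixed for every frame of a video."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def sample_motion_codes(dim, num_frames, rng):
    """Draw a sequence of motion codes, one per frame.

    Assumption: the toy recurrence h_t = tanh(0.9 * h_{t-1} + eps_t)
    stands in for a learned recurrent network.
    """
    h = [0.0] * dim
    codes = []
    for _ in range(num_frames):
        h = [math.tanh(0.9 * h_i + rng.gauss(0.0, 1.0)) for h_i in h]
        codes.append(h)
    return codes


def video_latents(content, motion_codes):
    """Concatenate the fixed content code with each per-frame motion code."""
    return [content + m for m in motion_codes]


rng = random.Random(0)
content_a = sample_content_code(4, rng)   # hypothetical "object A"
content_b = sample_content_code(4, rng)   # hypothetical "object B"
motion_1 = sample_motion_codes(2, 5, rng)  # hypothetical "action 1"
motion_2 = sample_motion_codes(2, 5, rng)  # hypothetical "action 2"

# Same object, different actions: content fixed, motion varied.
same_object_action_1 = video_latents(content_a, motion_1)
same_object_action_2 = video_latents(content_a, motion_2)

# Same action, different object: motion fixed, content varied.
other_object_action_1 = video_latents(content_b, motion_1)
```

In this sketch every frame of a video shares the first four entries (the content part), while the last two entries (the motion part) change from frame to frame. Swapping only one of the two parts corresponds to changing only the object or only the action.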

Examples of generated videos

We trained MoCoGAN on the MUG Facial Expression Database to generate facial expressions. With the content code fixed and the motion code varied, it generates the same person performing different expressions. With the motion code fixed and the content code varied, it generates different people performing the same expression. In the figure shown below, each column has a fixed identity and each row shows the same action:

We trained MoCoGAN on a human action dataset in which content is represented by the performer, who executes several actions. With the content code fixed and the motion code varied, it generates the same person performing different actions. With the motion code fixed and the content code varied, it generates different people performing the same action. Each pair of images represents the same action executed by different people:

We have collected a large-scale TaiChi dataset including 4.5K videos of TaiChi performers. Below are videos generated by MoCoGAN.

Training MoCoGAN

Please refer to the wiki page.

Citation

If you use MoCoGAN in your research please cite our paper:

Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz, "MoCoGAN: Decomposing Motion and Content for Video Generation"

@inproceedings{Tulyakov:2018:MoCoGAN,
 title={{MoCoGAN}: Decomposing motion and content for video generation},
 author={Tulyakov, Sergey and Liu, Ming-Yu and Yang, Xiaodong and Kautz, Jan},
 booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
 pages={1526--1535},
 year={2018}
}
