PyTorch implementation for the visual prior component (i.e. perception module) of the Visually Grounded Physics Learner [Li et al., 2020].

Overview

VGPL-Visual-Prior

PyTorch implementation for the visual prior component (i.e. perception module) of the Visually Grounded Physics Learner (VGPL). Given visual obseravtions, the visual prior proposes their corresponding particle representations, in the form of particle positions and groupings. Please see the following paper for more details.

Visual Grounding of Learned Physical Models

Yunzhu Li, Toru Lin*, Kexin Yi*, Daniel M. Bear, Daniel L. K. Yamins, Jiajun Wu, Joshua B. Tenenbaum, and Antonio Torralba

ICML 2020 [website] [paper] [video]

Demo

Input RGB videos and predictions from our learned model

Prerequisites

  • Python 3
  • PyTorch 1.0 or higher, with NVIDIA CUDA Support
  • Other required packages in requirements.txt

Code overview

Helper files

config.py contains all configurations used for model training, model evaluation and output generation.

dataset.py contains helper functions for loading and standardizing data and related variables. Note that paths to data directories is specified in the _DATA_DIR variable in this file, not in config.py.

loss.py contains helper functions for calculating Chamfer loss in different settings (e.g. in a single frame, across a time sequence, etc.).

model.py implements the neural network model used for prediction.

Main files

The following files can be run directly; see "Training and evaluation" section for more details.

train.py trains a model that could convert input observations into their particle representations.

eval.py evaluates a trained model by visualizing its predictions, and/or stores the output predictions in .h5 format.

Training and evaluation

Download the training and evaluation data from the following links, and put them in data folder. Optionally, download our trained model checkpoints and put them in dump folder.

To train a model:

python train.py --set loss_type l2 dataset RigidFall

To debug (by overfitting model on small batch of data):

python train.py --set loss_type l2 dataset RigidFall debug True

To evaluate a trained model and generate outputs using our provided checkpoints:

python eval.py --set loss_type l2 dataset RigidFall n_frames 4 n_frames_eval 30 load_path dump/rigid_fall_4frame_l2.pth
python eval.py --set loss_type l2 dataset MassRope n_frames 4 n_frames_eval 30 load_path dump/mass_rope_4frame_l2.pth

See config.py for more details on customizable configurations.

Citing VGPL

If you find this codebase useful in your research, please consider citing:

@inproceedings{li2020visual,
    Title={Visual Grounding of Learned Physical Models},
    Author={Li, Yunzhu and Lin, Toru and Yi, Kexin and Bear, Daniel and Yamins, Daniel L.K. and Wu, Jiajun and Tenenbaum, Joshua B. and Torralba, Antonio},
    Booktitle={ICML},
    Year={2020}
}

@inproceedings{li2019learning,
    Title={Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids},
    Author={Li, Yunzhu and Wu, Jiajun and Tedrake, Russ and Tenenbaum, Joshua B and Torralba, Antonio},
    Booktitle={ICLR},
    Year={2019}
}
Owner
Toru
Toru
https://arxiv.org/abs/2102.11005

LogME LogME: Practical Assessment of Pre-trained Models for Transfer Learning How to use Just feed the features f and labels y to the function, and yo

THUML: Machine Learning Group @ THSS 149 Dec 19, 2022
The Agriculture Domain of ERPNext comes with features to record crops and land

Agriculture The Agriculture Domain of ERPNext comes with features to record crops and land, track plant, soil, water, weather analytics, and even trac

Frappe 21 Jan 02, 2023
Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification

Fine-grainedImageClassification Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification We trained model here: lin

ZhenchaoTang 14 Oct 21, 2022
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021) Citation Please cite as: @inproceedings{liu2020understan

Sunbow Liu 22 Nov 25, 2022
Pixel-Perfect Structure-from-Motion with Featuremetric Refinement (ICCV 2021, Oral)

Pixel-Perfect Structure-from-Motion (ICCV 2021 Oral) We introduce a framework that improves the accuracy of Structure-from-Motion by refining keypoint

Computer Vision and Geometry Lab 831 Dec 29, 2022
MegEngine implementation of YOLOX

Introduction YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and ind

旷视天元 MegEngine 77 Nov 22, 2022
An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

CPC_audio This code implements the Contrast Predictive Coding algorithm on audio data, as described in the paper Unsupervised Pretraining Transfers we

Meta Research 283 Dec 30, 2022
In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

In-Place Activated BatchNorm In-Place Activated BatchNorm for Memory-Optimized Training of DNNs In-Place Activated BatchNorm (InPlace-ABN) is a novel

1.3k Dec 29, 2022
unofficial pytorch implement of "Squareplus: A Softplus-Like Algebraic Rectifier"

SquarePlus (Pytorch implement) unofficial pytorch implement of "Squareplus: A Softplus-Like Algebraic Rectifier" SquarePlus Squareplus is a Softplus-L

SeeFun 3 Dec 29, 2021
KAPAO is an efficient multi-person human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.

KAPAO (Keypoints and Poses as Objects) KAPAO is an efficient single-stage multi-person human pose estimation model that models keypoints and poses as

Will McNally 664 Dec 30, 2022
Code for the ICCV 2021 paper "Pixel Difference Networks for Efficient Edge Detection" (Oral).

Microsoft365_devicePhish Abusing Microsoft 365 OAuth Authorization Flow for Phishing Attack This is a simple proof-of-concept script that allows an at

Alex 236 Dec 21, 2022
Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations (CVPR, 2019)

Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations (CVPR 2019) To make better use of given limited labels, we propo

126 Sep 13, 2022
This repository attempts to replicate the SqueezeNet architecture and implement the same on an image classification task.

SqueezeNet-Implementation This repository attempts to replicate the SqueezeNet architecture using TensorFlow discussed in the research paper: "Squeeze

Rohan Mathur 3 Dec 13, 2022
Framework for training options with different attention mechanism and using them to solve downstream tasks.

Using Attention in HRL Framework for training options with different attention mechanism and using them to solve downstream tasks. Requirements GPU re

5 Nov 03, 2022
AI-based, context-driven network device ranking

Batea A batea is a large shallow pan of wood or iron traditionally used by gold prospectors for washing sand and gravel to recover gold nuggets. Batea

Secureworks Taegis VDR 269 Nov 26, 2022
EdMIPS: Rethinking Differentiable Search for Mixed-Precision Neural Networks

EdMIPS is an efficient algorithm to search the optimal mixed-precision neural network directly without proxy task on ImageNet given computation budgets. It can be applied to many popular network arch

Zhaowei Cai 47 Dec 30, 2022
Face Depixelizer based on "PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models" repository.

NOTE We have noticed a lot of concern that PULSE will be used to identify individuals whose faces have been blurred out. We want to emphasize that thi

Denis Malimonov 2k Dec 29, 2022
A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Convolutional Neural Network (CNN). This repository contains a source code of a deep learning network built with TensorFlow and Keras to classify gend

Pawel Dziemiach 1 Dec 19, 2021
TensorFlow implementation of the paper "Hierarchical Attention Networks for Document Classification"

Hierarchical Attention Networks for Document Classification This is an implementation of the paper Hierarchical Attention Networks for Document Classi

Quoc-Tuan Truong 83 Dec 05, 2022
Camera ready code repo for the NeuRIPS 2021 paper: "Impression learning: Online representation learning with synaptic plasticity".

Impression-Learning-Camera-Ready Camera ready code repo for the NeuRIPS 2021 paper: "Impression learning: Online representation learning with synaptic

2 Feb 09, 2022