This repository contains a toolkit for collecting, labeling and tracking object keypoints

Overview

Object Keypoint Tracking

This repository contains a toolkit for collecting, labeling and tracking object keypoints. Object keypoints are semantic points in an object's coordinate frame.

The project allows collecting images from multiple viewpoints using a robot with a wrist-mounted camera. These image sequences can then be labeled using an easy-to-use interface, StereoLabel.

StereoLabel keypoint labeling

Once the images are labeled, a model can be learned to detect keypoints in the images and compute 3D keypoints in the camera's coordinate frame.

Installation

External Dependencies:

  • HUD
  • ROS melodic/noetic

Install HUD. Then install the dependencies with pip install -r requirements.txt and finally install the package in editable mode with pip3 install -e . (note the single trailing dot, which refers to the current directory).

Usage

Here we describe the process we used to arrive at our labeled datasets and learned models.

Calibration and setup

First, calibrate your camera and obtain a hand-eye calibration. The camera can be calibrated using Kalibr. The hand-eye calibration can be done with the ethz-asl/hand_eye_calibration or easy_handeye packages.

The software currently assumes that the Kalibr pinhole-equi camera model was used when calibrating the camera.

Kalibr produces a YAML file like the one at config/calibration.yaml. Pass this file as the --calibration argument to label.py and the other scripts.
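To make the pinhole-equi assumption more concrete, here is a minimal sketch of how a 3D point in the camera frame maps to a pixel under a pinhole projection with equidistant distortion. It assumes the usual Kalibr conventions, where the intrinsics are ordered [fu, fv, pu, pv] and there are four distortion coefficients. The function name and the numeric values are made up for illustration and are not taken from this repository.

```python
import math


def project_equidistant(point, intrinsics, distortion):
    """Project a 3D point (camera frame, z forward) to pixel coordinates.

    intrinsics: [fu, fv, pu, pv]; distortion: [k1, k2, k3, k4].
    Both orderings are assumed to follow Kalibr's calibration output.
    """
    x, y, z = point
    fu, fv, pu, pv = intrinsics
    k1, k2, k3, k4 = distortion

    r = math.hypot(x, y)
    if r < 1e-12:
        # A point on the optical axis projects to the principal point.
        return (pu, pv)

    # Angle between the optical axis and the ray through the point.
    theta = math.atan2(r, z)
    t2 = theta * theta
    # theta_d = theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8)
    theta_d = theta * (1.0 + t2 * (k1 + t2 * (k2 + t2 * (k3 + t2 * k4))))
    scale = theta_d / r
    return (fu * scale * x + pu, fv * scale * y + pv)


# Hypothetical calibration values, for illustration only.
pixel = project_equidistant((0.1, -0.05, 1.0), [400.0, 400.0, 320.0, 240.0], [0.0, 0.0, 0.0, 0.0])
```

With all distortion coefficients set to zero, the model reduces to the ideal equidistant mapping, where the distance from the principal point grows linearly with the viewing angle.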

Once you have obtained the hand-eye calibration, configure your robot description so that the tf tree can correctly transform poses from the base frame to the camera optical frame.
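As a sanity check on what the tf tree has to provide, the following sketch chains two rigid transforms (base to end effector, and end effector to camera, the latter being the hand-eye calibration result) and uses the inverse to express a base-frame point in the camera frame. The frame names and numeric values are hypothetical, and plain Python lists stand in for whatever matrix type your setup uses.

```python
def matmul4(a, b):
    """Multiply two 4x4 homogeneous transforms given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(T):
    """Invert a rigid transform: rotation R^T, translation -R^T t."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def transform_point(T, p):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3]
                 for i in range(3))


def translation(tx, ty, tz):
    """Build a pure-translation transform (identity rotation)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


# Hypothetical poses: end effector in the base frame, camera in the
# end-effector frame (the hand-eye calibration result).
T_base_ee = translation(0.5, 0.0, 0.3)
T_ee_camera = translation(0.0, 0.0, 0.1)

T_base_camera = matmul4(T_base_ee, T_ee_camera)
point_in_camera = transform_point(invert_rigid(T_base_camera), (0.5, 0.0, 1.0))
```

If a point with a known base-frame position does not land where you expect in the camera frame, the hand-eye calibration or the robot description is the first place to look.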

Collecting data

The script scripts/collect_bags.py is a helper program that assists in collecting data. It uses rosbag to record the camera topics and transform messages.

Run it with python3 scripts/collect_bags.py --out .

Press enter to start recording a new sequence. Recording will start after a 5 second grace period, after which the topics will be recorded for 30 seconds. During the 30 seconds, slowly guide the robot arm to different viewpoints observing your target objects.
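The timing described above can be sketched roughly as follows. This is not the code of scripts/collect_bags.py, only an illustration of a grace period followed by a fixed-duration recording using the standard rosbag record command line with its -O and --duration options. The topic names are placeholders for your own camera and transform topics.

```python
import subprocess
import time


def build_record_command(out_path, topics, duration_s=30):
    """Build a `rosbag record` command that stops by itself after duration_s."""
    return ["rosbag", "record", "-O", out_path,
            f"--duration={duration_s}"] + list(topics)


def record_sequence(out_path, topics, grace_s=5, duration_s=30):
    """Wait for the grace period, then record the given topics."""
    time.sleep(grace_s)
    subprocess.run(build_record_command(out_path, topics, duration_s), check=True)


# Hypothetical topic names; replace them with the topics of your setup.
command = build_record_command("sequence_000.bag", ["/camera/image_raw", "/tf"])
```

Calling record_sequence requires a working ROS installation, whereas build_record_command can be inspected without one.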

Encoding data

Since rosbag is not a very convenient or efficient format for our purposes, we encode the data into a format that is easier to work with and uses less disk space. This is done using the script scripts/encode_bags.py.

Run it with python3 scripts/encode_bags.py --bags --out --calibration .

Labeling data

Valve

First decide how many keypoints you will use for your object class and what their configuration is. Write a keypoint configuration file, like config/valve.json and config/cups.json. For example, in the case of our valve above, we define four different keypoints, which are of two types. The first type is the center keypoint type and the second is the spoke keypoint type. For our valve, there are three spokes, so we write our keypoint configuration as:

{ "keypoint_config": [1, 3] }

This means that there is first one keypoint of the first type, followed by three keypoints of the second type. Save this file for later.
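To illustrate how such a configuration can be read, the sketch below expands the list of per-type counts into one type index per keypoint, in labeling order. The helper name keypoint_types is hypothetical and not part of this repository.

```python
import json


def keypoint_types(keypoint_config):
    """Expand per-type counts into one type index per keypoint."""
    types = []
    for type_index, count in enumerate(keypoint_config):
        types.extend([type_index] * count)
    return types


config = json.loads('{ "keypoint_config": [1, 3] }')
types = keypoint_types(config["keypoint_config"])
print(types)  # [0, 1, 1, 1]
```

For the valve, the first labeled point is the center (type 0) and the next three are spokes (type 1), giving four keypoints per object.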

StereoLabel can be launched with python3 scripts/label.py . To label keypoints, click on the keypoints in the same order in each image. Make sure to label the points consistently with the keypoint configuration that you defined, so that the keypoints end up on the right heatmaps downstream.

If you have multiple objects in the scene, it is important that you annotate one object at a time, sticking to the keypoint order, as the tool assumes that one object's keypoints follow each other. The number of keypoints you label should equal the number of objects times the total number of keypoints per object.
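The counting rule above can be checked with a few lines of code. The following sketch splits a flat list of labeled points into per-object groups and rejects a list whose length is not a multiple of the keypoints per object. The function name and the point format are assumptions made for illustration.

```python
def group_by_object(points, keypoint_config):
    """Split a flat list of labeled points into consecutive per-object groups."""
    per_object = sum(keypoint_config)
    if per_object == 0 or len(points) % per_object != 0:
        raise ValueError(
            f"expected a multiple of {per_object} keypoints, got {len(points)}")
    return [points[i:i + per_object] for i in range(0, len(points), per_object)]


# Two valves with the [1, 3] configuration: eight clicks in total.
clicks = [(10, 10), (12, 20), (20, 8), (4, 8),
          (110, 10), (112, 20), (120, 8), (104, 8)]
objects = group_by_object(clicks, [1, 3])
```

Each group then holds the center keypoint first, followed by the three spoke keypoints of the same object.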

Once you have labeled an equal number of points on the left and right image, points will be backprojected, so that you can make sure that everything is correctly configured and that you didn't accidentally label the points in the wrong order. The points are saved at the same time to a file keypoints.json in each scene's directory.
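The text does not state which method the tool uses to recover 3D points from the two labeled views. As one common option, the sketch below applies midpoint triangulation: each click defines a ray from its camera center, and the 3D point is taken as the midpoint of the shortest segment between the two rays. All names and values are illustrative.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(origin_a, dir_a, origin_b, dir_b, eps=1e-9):
    """Return the midpoint of the shortest segment between two rays.

    Origins are camera centers and directions are viewing rays through the
    clicked pixels, all expressed in one common frame.
    """
    w = [p - q for p, q in zip(origin_a, origin_b)]
    a = _dot(dir_a, dir_a)
    b = _dot(dir_a, dir_b)
    c = _dot(dir_b, dir_b)
    d = _dot(dir_a, w)
    e = _dot(dir_b, w)
    denom = a * c - b * b
    if denom <= eps * a * c:
        # Nearly parallel rays: the viewpoints are too close to each other.
        raise ValueError("rays are nearly parallel; use viewpoints further apart")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    closest_a = [o + s * v for o, v in zip(origin_a, dir_a)]
    closest_b = [o + t * v for o, v in zip(origin_b, dir_b)]
    return tuple(0.5 * (p + q) for p, q in zip(closest_a, closest_b))


# Two cameras one meter apart, both looking at a point two meters ahead.
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0))
```

The parallel-ray check shows why two frames taken from almost the same viewpoint give poorly constrained 3D points.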

Here are some keyboard actions the tool supports:

  • Press a to replace the left frame with a random frame from the current sequence.
  • Press b to replace the right frame with a random frame from the current sequence.
  • Press to go to next sequence, after you labeled a sequence.

Switching frames is especially useful when, for example, a keypoint is occluded in one viewpoint and is hard to annotate accurately.

Once the points have been saved and backprojected, you can freely press a and b to swap the frames for different ones in the sequence. The tool projects the 3D points back into 2D on the new frames, so you can check that the keypoints project nicely onto each frame. If they do not, the likely causes are a misclick, viewpoints that are too close to each other, a problem with your intrinsics or hand-eye calibration, or camera poses that are inaccurate for some other reason.
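The visual check described above can also be expressed as a number. The sketch below computes the pixel distance between a clicked keypoint and the projection of its 3D point. As a simplification it uses an ideal pinhole projection without the equidistant distortion, and the intrinsics and coordinates are hypothetical.

```python
import math


def reprojection_error(point_cam, clicked_pixel, fu, fv, pu, pv):
    """Pixel distance between a click and the projection of a 3D point.

    point_cam is expressed in the camera frame of the inspected image.
    Lens distortion is ignored in this simplified pinhole sketch.
    """
    x, y, z = point_cam
    if z <= 0.0:
        raise ValueError("point is behind the camera")
    u = fu * x / z + pu
    v = fv * y / z + pv
    return math.hypot(u - clicked_pixel[0], v - clicked_pixel[1])


# Hypothetical values: the projection lands at about (360, 240).
error = reprojection_error((0.1, 0.0, 1.0), (363.0, 244.0),
                           fu=400.0, fv=400.0, pu=320.0, pv=240.0)
```

An error of a few pixels usually points to click noise, while a consistently large error across frames suggests a calibration or pose problem.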

Checking the data

Once all your sequences have been labeled, you can check that the labels are correct on all frames using python scripts/show_keypoints.py , which will play the images one by one and show the backprojected points.

Learning a model

First, download the weights for the CornerNet backbone model. This can be done from the CornerNet repository. We use the CornerNet-Squeeze model. Place the file at models/corner_net.pkl.

You can train a model with python scripts/train.py --train --val . Here, --train points to the directory containing your training scenes and --val points to the directory containing your validation scenes.

Once done, you can package a model with python scripts/package_model.py --model lightning_logs/version_x/checkpoints/ .ckpt --out model.pt
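If you are unsure which checkpoint to pass to --model, a small helper can pick the most recently written one. This sketch assumes the default PyTorch Lightning layout of lightning_logs/version_N/checkpoints/ with files ending in .ckpt. The helper is hypothetical and not part of this repository.

```python
from pathlib import Path


def latest_checkpoint(log_dir="lightning_logs"):
    """Return the most recently modified .ckpt file under log_dir."""
    checkpoints = list(Path(log_dir).glob("version_*/checkpoints/*.ckpt"))
    if not checkpoints:
        raise FileNotFoundError(f"no checkpoints found under {log_dir}")
    return max(checkpoints, key=lambda path: path.stat().st_mtime)
```

The most recent checkpoint is not necessarily the best one, so compare validation metrics before packaging.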

You can then run and check the metrics on a test set using python scripts/eval_model.py --model model.pt --keypoints .

General tips

Here are some general tips that might be of use:

  • Collect data at around 4-5 fps. Frames that are very close to each other are not that useful, and you do not need every single frame. To do this, configure your camera node to publish image messages only at that rate.
  • Increase the publishing rate of your robot_state_publisher node to something like 100 or 200 Hz.
  • Move your robot slowly when collecting the data such that the time synchronization between your camera and robot is not that big of a problem.
  • Keep the scenes reasonable.
  • Collect data in all the operating conditions in which you want to detect keypoints.
Owner
ETHZ ASL