Torch-based tool for quantizing high-dimensional vectors using additive codebooks

Last update: Jan 07, 2023

Overview

Trainable multi-codebook quantization

This repository implements a PyTorch utility, best run on a GPU, for training an efficient quantizer based on multiple single-byte codebooks. The typical scenario: you have a distribution over vectors of some dimension, say 512, perhaps produced by a neural-network embedding, and you want to encode each vector into a short sequence of bytes (say, 4 or 8) from which the vector can be reconstructed with minimal expected loss, measured as squared distance, i.e. squared l2 loss.
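
To make the idea of "multiple single-byte codebooks" concrete, here is a minimal sketch of how an additive-codebook code could be decoded: each byte selects one of 256 entries in its own codebook, and the reconstruction is the sum of the selected entries. The function name decode_additive and the (num_codebooks, 256, dim) tensor layout are assumptions made for illustration; they are not taken from quantization.py, whose internals may differ.

```python
import torch


def decode_additive(codes: torch.Tensor, codebooks: torch.Tensor) -> torch.Tensor:
    """Reconstruct vectors as the sum of one entry per codebook.

    codes:     (N, num_codebooks) uint8 indexes, one byte per codebook
    codebooks: (num_codebooks, 256, dim) float entries (hypothetical layout)
    returns:   (N, dim) reconstructed vectors
    """
    num_codebooks = codebooks.shape[0]
    idx = codes.long()  # convert bytes to int64 so they act as indexes, not a mask
    book = torch.arange(num_codebooks, device=codes.device)
    # codebooks[book, idx] picks entry idx[n, k] from codebook k,
    # giving shape (N, num_codebooks, dim); summing over k adds the entries.
    return codebooks[book, idx].sum(dim=1)


# Illustration with random (untrained) codebooks and random codes.
torch.manual_seed(0)
dim, num_codebooks = 512, 4
codebooks = torch.randn(num_codebooks, 256, dim)
codes = torch.randint(0, 256, (10, num_codebooks), dtype=torch.uint8)
x_approx = decode_additive(codes, codebooks)  # shape (10, 512)

# Squared l2 loss against some target vectors x, averaged over the batch.
x = torch.randn(10, dim)
loss = ((x - x_approx) ** 2).sum(dim=-1).mean()
```

With 4 codebooks each vector is stored in 4 bytes, compared with 2048 bytes for a 512-dimensional float32 vector. Training is what makes the codebooks and the choice of codes useful; the random values here only show the shapes involved.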

This repository provides a Quantizer object that performs this quantization, and an associated QuantizerTrainer object that trains the Quantizer. For example, you might feed the QuantizerTrainer 20,000 minibatches of vectors.

Usage

Installation

python3 setup.py install

Example

import torch
import quantization

trainer = quantization.QuantizerTrainer(dim=256, bytes_per_frame=4,
                                        device=torch.device('cuda'))
while not trainer.done():
    # let x be some tensor of shape (*, dim) that you will train on
    # (it should be a different minibatch on each step)
    trainer.step(x)
quantizer = trainer.get_quantizer()

# let x be some tensor of shape (*, dim)
encoded = quantizer.encode(x)  # (*, 4), dtype=uint8
x_approx = quantizer.decode(encoded)  # approximate reconstruction of x
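
One way to judge whether the reconstruction is good enough is to compare the squared error with the energy of the input. The helper below is a sketch and not part of this repository; the name relative_sq_error is hypothetical. It works on any pair of tensors with the same shape, such as x and x_approx from the example above.

```python
import torch


def relative_sq_error(x: torch.Tensor, x_approx: torch.Tensor) -> float:
    """Squared l2 reconstruction error divided by the squared l2 norm of x.

    0.0 means a perfect reconstruction; 1.0 means the reconstruction is
    no better than predicting all zeros.
    """
    num = ((x - x_approx) ** 2).sum()
    den = (x ** 2).sum()
    return (num / den).item()
```

A lower value indicates that the byte codes retain more of the original vector. Evaluating on vectors that were not used for training gives a fairer estimate.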

To avoid versioning issues and so on, it may be easier to just include quantization.py in your repository directly (and add its requirements to your requirements.txt).

Owner

Daniel Povey
