Lingvo

What is it?

Lingvo is a framework for building neural networks in TensorFlow, particularly sequence models.

A list of publications using Lingvo can be found here.

Releases
- Major breaking changes
Quick start
Models
References
License

Releases

PyPI Version	Commit
0.10.0	075fd1d88fa6f92681f58a2383264337d0e737ee
0.9.1	c1124c5aa7af13d2dd2b6d43293c8ca6d022b008
0.9.0	f826e99803d1b51dccbbbed1ef857ba48a2bbefe

Older releases

PyPI Version	Commit
0.8.2	93e123c6788e934e6b7b1fd85770371becf1e92e
0.7.2	b05642fe386ee79e0d88aa083565c9a93428519e

Details for older releases are unavailable.

Major breaking changes

NOTE: this is not a comprehensive list. Lingvo releases do not offer any guarantees regarding backwards compatibility.

HEAD

Nothing here.

0.10.0

General
- The theta_fn arg to CreateVariable() has been removed.

0.9.1

General
- Python 3.9 is now supported.
- ops.beam_search_step now takes and returns an additional arg beam_done.
- The namedtuple beam_search_helper.BeamSearchDecodeOutput no longer has the field done_hyps.

0.9.0

General
- TensorFlow 2.5 is now the required version.
- Python 3.5 support has been removed.
- py_utils.AddGlobalVN and py_utils.AddPerStepVN have been combined into py_utils.AddVN.
- BaseSchedule().Value() no longer takes a step arg.
- Classes deriving from BaseSchedule should implement Value() instead of FProp().
- theta.global_step has been removed in favor of py_utils.GetGlobalStep().
- py_utils.GenerateStepSeedPair() no longer takes a global_step arg.
- PostTrainingStepUpdate() no longer takes a global_step arg.
- The fatal_errors argument to custom input ops now takes error message substrings rather than integer error codes.

Older releases

0.8.2

General
- NestedMap Flatten/Pack/Transform/Filter etc. now expand descendant dicts as well.
- Subclasses of BaseLayer extending from abc.ABCMeta should now extend base_layer.ABCLayerMeta instead.
- Trying to call self.CreateChild outside of __init__ now raises an error.
- base_layer.initializer has been removed. Subclasses no longer need to decorate their __init__ function.
- Trying to call self.CreateVariable outside of __init__ or _CreateLayerVariables now raises an error.
- It is no longer possible to access self.vars or self.theta inside of __init__. Refactor by moving the variable creation and access to _CreateLayerVariables. The variable scope is set automatically according to the layer name in _CreateLayerVariables.

Details for older releases are unavailable.

Quick start

Installation

There are two ways to set up Lingvo: installing a fixed version through pip, or cloning the repository and building it with bazel. Docker configurations are provided for each case.

If you only want to use the framework as-is, installing it through pip is the easiest option. This lets you develop and train custom models against a frozen version of the Lingvo framework, but makes it difficult to modify the framework code or implement new custom ops.

If you would like to develop the framework further and potentially contribute pull requests, you should avoid using pip and clone the repository instead.

pip:

The Lingvo pip package can be installed with pip3 install lingvo.

See the codelab for how to get started with the pip package.

From sources:

The prerequisites are:

a TensorFlow 2.6 installation,
a C++ compiler (only g++ 7.3 is officially supported), and
the bazel build system.

Refer to docker/dev.dockerfile for a set of working requirements.

git clone the repository, then use bazel to build and run targets directly. The python -m module commands in the codelab need to be mapped onto bazel run commands.
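The mapping from a python -m command to a bazel run command can be sketched as below. This is a minimal illustration, assuming that each module's bazel target shares the module's name, as the keras2ckpt and trainer commands in this README suggest. The helper name module_to_bazel_run is hypothetical and not part of Lingvo.

```python
import shlex


def module_to_bazel_run(command):
    """Rewrites `python3 -m pkg.mod --flag` as `bazel run -c opt //pkg:mod -- --flag`.

    Assumption: the bazel target name equals the last component of the module path.
    """
    args = shlex.split(command)
    module_index = args.index("-m") + 1
    package, _, target = args[module_index].rpartition(".")
    label = "//" + package.replace(".", "/") + ":" + target
    flags = args[module_index + 1:]
    parts = ["bazel", "run", "-c", "opt", label]
    if flags:
        # Everything after "--" is passed to the program rather than to bazel.
        parts += ["--"] + flags
    return " ".join(shlex.quote(part) for part in parts)


print(module_to_bazel_run("python3 -m lingvo.tools.keras2ckpt --dataset=mnist"))
# prints: bazel run -c opt //lingvo/tools:keras2ckpt -- --dataset=mnist
```

Only the command shape is translated. Flag values such as --model can still differ between the pip and bazel setups.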

docker:

Docker configurations are available for both situations. Instructions can be found in the comments at the top of each file.

lib.dockerfile has the Lingvo pip package preinstalled.
dev.dockerfile can be used to build Lingvo from sources.

How to install docker.

Running the MNIST image model

Preparing the input data

pip:

mkdir -p /tmp/mnist
python3 -m lingvo.tools.keras2ckpt --dataset=mnist

bazel:

mkdir -p /tmp/mnist
bazel run -c opt //lingvo/tools:keras2ckpt -- --dataset=mnist

The following files will be created in /tmp/mnist:

mnist.data-00000-of-00001: 53MB.
mnist.index: 241 bytes.
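A quick way to confirm that the data preparation step succeeded is to check for the two files listed above. The sketch below only tests that each file exists and is non-empty; the function name missing_mnist_files is an illustrative assumption, not a Lingvo utility.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("mnist.data-00000-of-00001", "mnist.index")


def missing_mnist_files(directory="/tmp/mnist"):
    """Returns the expected file names that are absent or empty in `directory`."""
    root = Path(directory)
    return [
        name for name in EXPECTED_FILES
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]


if __name__ == "__main__":
    missing = missing_mnist_files()
    print("All input files present." if not missing else f"Missing: {missing}")
```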

Running the model

pip:

cd /tmp/mnist
curl -O https://raw.githubusercontent.com/tensorflow/lingvo/master/lingvo/tasks/image/params/mnist.py
python3 -m lingvo.trainer --run_locally=cpu --mode=sync --model=mnist.LeNet5 --logdir=/tmp/mnist/log

bazel:

(cpu) bazel build -c opt //lingvo:trainer
(gpu) bazel build -c opt --config=cuda //lingvo:trainer
bazel-bin/lingvo/trainer --run_locally=cpu --mode=sync --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr

After about 20 seconds, the loss should drop below 0.3 and a checkpoint will be saved, like below. Kill the trainer with Ctrl+C.

trainer.py:518] step:   205, steps/sec: 11.64 ... loss:0.25747201 ...
checkpointer.py:115] Save checkpoint
checkpointer.py:117] Save checkpoint done: /tmp/mnist/log/train/ckpt-00000205
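To watch for the loss threshold programmatically rather than by eye, the step and loss can be pulled out of a trainer log line with a regular expression. This is a rough sketch that assumes the log format shown above; the real format may vary between Lingvo versions, and the helper names are hypothetical.

```python
import re

STEP_RE = re.compile(r"step:\s*(\d+)")
LOSS_RE = re.compile(r"\bloss:\s*(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")


def parse_train_line(line):
    """Returns (step, loss) from a trainer log line, or None if either is absent."""
    step = STEP_RE.search(line)
    loss = LOSS_RE.search(line)
    if not (step and loss):
        return None
    return int(step.group(1)), float(loss.group(1))


def below_target(line, target=0.3):
    """True if the line reports a loss below `target` (0.3 as in the text above)."""
    parsed = parse_train_line(line)
    return parsed is not None and parsed[1] < target


sample = "trainer.py:518] step:   205, steps/sec: 11.64 ... loss:0.25747201 ..."
print(parse_train_line(sample))  # prints: (205, 0.25747201)
```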

Some artifacts will be produced in /tmp/mnist/log/control:

params.txt: hyper-parameters.
model_analysis.txt: model sizes for each layer.
train.pbtxt: the training tf.GraphDef.
events.*: a tensorboard events file.

Further artifacts will be produced in /tmp/mnist/log/train:

checkpoint: a text file containing information about the checkpoint files.
ckpt-*: the checkpoint files.
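The newest checkpoint step can be read from the ckpt-* file names without loading TensorFlow. The sketch below assumes the names start with ckpt- followed by the step number, as in the ckpt-00000205 path logged above; latest_checkpoint_step is an illustrative helper, not a Lingvo function.

```python
import re
from pathlib import Path

CKPT_RE = re.compile(r"^ckpt-(\d+)")


def latest_checkpoint_step(train_dir):
    """Returns the highest step among `ckpt-<step>*` files, or None if there are none."""
    steps = []
    for path in Path(train_dir).glob("ckpt-*"):
        match = CKPT_RE.match(path.name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps, default=None)


if __name__ == "__main__":
    print(latest_checkpoint_step("/tmp/mnist/log/train"))
```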

Now, let's evaluate the model on the "Test" dataset. In the normal training setup, the trainer and evaler should be run at the same time as two separate processes.

pip:

python3 -m lingvo.trainer --job=evaler_test --run_locally=cpu --mode=sync --model=mnist.LeNet5 --logdir=/tmp/mnist/log

bazel:

bazel-bin/lingvo/trainer --job=evaler_test --run_locally=cpu --mode=sync --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr

Kill the job with Ctrl+C when it starts waiting for a new checkpoint.

base_runner.py:177] No new check point is found: /tmp/mnist/log/train/ckpt-00000205

The evaluation accuracy can be found slightly earlier in the logs.

base_runner.py:111] eval_test: step:   205, acc5: 0.99775392, accuracy: 0.94150388, ..., loss: 0.20770954, ...
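For scripted comparisons between runs, the metrics in an evaler log line like the one above can be collected into a dictionary. This is a minimal sketch that assumes the `name: value` layout shown in the sample line; parse_eval_metrics is a hypothetical helper and the real log format may differ.

```python
import re

METRIC_RE = re.compile(r"(\w+):\s*(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")


def parse_eval_metrics(line):
    """Returns {name: value} for every `name: number` pair after the log prefix."""
    # Drop the "file.py:NNN] " prefix so the line number is not read as a metric.
    payload = line.partition("] ")[2] or line
    return {name: float(value) for name, value in METRIC_RE.findall(payload)}


sample = ("base_runner.py:111] eval_test: step:   205, acc5: 0.99775392, "
          "accuracy: 0.94150388, ..., loss: 0.20770954, ...")
metrics = parse_eval_metrics(sample)
print(metrics["accuracy"])  # prints: 0.94150388
```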

Running the machine translation model

To run a more elaborate model, you'll need a cluster with GPUs. Please refer to third_party/py/lingvo/tasks/mt/README.md for more information.

Running the GShard transformer based giant language model

To train a GShard language model with one trillion parameters on GCP using CloudTPUs v3-512 with 512-way model parallelism, please refer to third_party/py/lingvo/tasks/lm/README.md.

Running the 3d object detection model

To run the StarNet model using CloudTPUs on GCP, please refer to third_party/py/lingvo/tasks/car/README.md.

Models

Automatic Speech Recognition

Listen, Attend and Spell.
William Chan, Navdeep Jaitly, Quoc V. Le, and Oriol Vinyals. ICASSP 2016.

End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results.
Jan Chorowski, Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. arXiv 2014.
- asr.librispeech.Librispeech960Grapheme
- asr.librispeech.Librispeech960Wpm

Car

StarNet: Targeted Computation for Object Detection in Point Clouds.
Jiquan Ngiam, Benjamin Caine, Wei Han, Brandon Yang, Yuning Chai, Pei Sun, Yin Zhou, Xi Yi, Ouais Alsharif, Patrick Nguyen, Zhifeng Chen, Jonathon Shlens, and Vijay Vasudevan. arXiv 2019.

Image

Gradient-based learning applied to document recognition.
Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner. IEEE 1998.
- image.mnist.LeNet5

Language Modelling

Exploring the Limits of Language Modeling.
Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu. arXiv, 2016.
- lm.one_billion_wds.WordLevelOneBwdsSimpleSampledSoftmax
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding.
Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. arXiv, 2020.
- lm.synthetic_packed_input.DenseLm1T16x16

Machine Translation

The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation.
Mia X. Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George Foster, Llion Jones, Mike Schuster, Noam Shazeer, Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Zhifeng Chen, Yonghui Wu, and Macduff Hughes. ACL 2018.
Self-supervised and Supervised Joint Training for Resource-rich Neural Machine Translation.
Yong Cheng, Wei Wang, Lu Jiang, and Wolfgang Macherey. ICML 2021.
- mt.wmt14_en_de_xendec.WmtEnDeXEnDec

References

Please cite this paper when referencing Lingvo.

@misc{shen2019lingvo,
    title={Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling},
    author={Jonathan Shen and Patrick Nguyen and Yonghui Wu and Zhifeng Chen and others},
    year={2019},
    eprint={1902.08295},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

License

Apache License 2.0
