Ludwig Benchmarking Toolkit

Overview

The Ludwig Benchmarking Toolkit is a personalized benchmarking toolkit for running end-to-end benchmark studies across an extensible set of tasks, deep learning models, standard datasets and evaluation metrics.

Getting set up

To get started, use the following commands to set up your conda environment.

git clone https://github.com/HazyResearch/ludwig-benchmarking-toolkit.git
cd ludwig-benchmarking-toolkit
conda env create -f environments/{environment-osx.yaml, environment-linux.yaml}
conda activate lbt

Relevant files and directories

experiment-templates/task_template.yaml: Every task (e.g. text classification) has its own task template. The template specifies the model architecture (encoder and decoder structure), the training parameters, and a hyperopt configuration for the task at hand. Most of the values in the template are populated at training time from hyperopt_config.yaml and dataset_metadata.yaml. The sample task template located in experiment-templates/task_template.yaml is for text classification. See sample-task-templates/ for other examples.

experiment-templates/hyperopt_config.yaml: provides ranges of values for the training parameters and hyperopt parameters that populate the hyperopt configuration in the task template.

experiment-templates/dataset_metadata.yaml: contains a list of all available datasets (and their associated metadata) over which hyperparameter optimization can be performed.

model-configs/: contains all encoder-specific YAML files. Each file specifies the possible values of the encoder parameters that will be optimized over. Each file in this directory adheres to the naming convention {encoder_name}_hyperopt.yaml.

hyperopt-experiment-configs/: houses all experiment configs built from the templates described above (note: this folder is populated at run time). These configs are used when the hyperopt experiment is called. At a high level, each config file specifies the training and hyperopt information for one (task, dataset, architecture) combination, for example (text classification, SST2, BERT).

elasticsearch_config.yaml: an optional file that only needs to be defined if experiment data will be saved to an Elasticsearch database.
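
To make the layout above more concrete, here is a minimal Python sketch of how the per-encoder file names and the (task, dataset, encoder) combinations could be enumerated. The helpers encoder_config_path and experiment_combinations are hypothetical and are not part of LBT. Only the model-configs/ directory and the {encoder_name}_hyperopt.yaml naming convention are taken from the descriptions above.

```python
from itertools import product
from pathlib import PurePosixPath

# Hypothetical helpers for illustration; they are not part of the LBT codebase.
MODEL_CONFIG_DIR = PurePosixPath("model-configs")


def encoder_config_path(encoder_name: str) -> PurePosixPath:
    """Return the path implied by the {encoder_name}_hyperopt.yaml convention."""
    return MODEL_CONFIG_DIR / f"{encoder_name}_hyperopt.yaml"


def experiment_combinations(task, datasets, encoders):
    """List every (task, dataset, encoder) triple, one per experiment config."""
    return [(task, d, e) for d, e in product(datasets, encoders)]


combos = experiment_combinations(
    "text classification", ["SST2", "agnews"], ["bert", "rnn"]
)
for task, dataset, encoder in combos:
    print(task, dataset, encoder, encoder_config_path(encoder))
```

Two datasets and two encoders yield four combinations, which matches the idea that hyperopt-experiment-configs/ holds one config per combination.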

USAGE

Command-Line Usage

Running your first TOY experiment:

For testing/setup purposes we have included a toy dataset called toy_agnews. This dataset contains a small set of training, test and validation samples from the original agnews dataset.

Before running a full-scale experiment, we recommend running an experiment locally on the toy dataset:

python experiment_driver.py --run_environment local --datasets toy_agnews --custom_models_list rnn

Running your first REAL experiment:

Steps for configuring + running an experiment:

  1. Declare and configure the search space of all non-model specific training and preprocessing hyperparameters in the experiment-templates/hyperopt_config.yaml file. The parameters specified in this file will be used across all model experiments.

  2. Declare and configure the search space of model-specific hyperparameters in the {encoder}_hyperopt.yaml files in ./model-configs

    NOTE:

    • for both (1) and (2), see the Ludwig Hyperparameter Optimization guide for the training, preprocessing, and input/output feature parameters that can be used in the hyperopt search
    • if the executor type is Ray, the list of available search spaces and the input format differ slightly from the built-in Ludwig types. Please see the Ray Tune search space docs for more information.
  3. Run the following command specifying the datasets, encoders, path to elastic DB index config file, run environment and more:

        python experiment_driver.py \
            --experiment_output_dir \
            --run_environment {local, gcp} \
            --elasticsearch_config \
            --dataset_cache_dir \
            --custom_model_list \
            --datasets \
            --resume_existing_exp bool

NOTE: Please use python experiment_driver.py -h to see list of available datasets, encoders and args
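
Steps 1 and 2 above describe two separate search spaces that end up in a single hyperopt configuration. The following is a minimal sketch of that idea, assuming both search spaces have already been read into flat dictionaries that map a parameter name to its range. The function merge_search_spaces, the parameter names, and the ranges are all illustrative assumptions and do not describe how LBT actually builds its configs.

```python
def merge_search_spaces(shared: dict, model_specific: dict) -> dict:
    """Combine shared and encoder-specific search spaces into one mapping.

    Raises ValueError if both define the same parameter, so that a model
    config cannot silently override a shared setting. This check is a
    design choice of the sketch, not documented LBT behavior.
    """
    overlap = sorted(set(shared) & set(model_specific))
    if overlap:
        raise ValueError(f"parameters defined twice: {overlap}")
    return {**shared, **model_specific}


# Illustrative values only; the real names and ranges live in the YAML files.
shared_space = {"training.learning_rate": {"low": 1e-5, "high": 1e-2}}
rnn_space = {"text.num_layers": {"values": [1, 2, 3]}}

merged = merge_search_spaces(shared_space, rnn_space)
print(sorted(merged))
```

Keeping the shared space in one place means every encoder is tuned over the same training and preprocessing ranges, which is what makes results comparable across models.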

API Usage

It is also possible to run and customize experiments using LBT's APIs. In the following section, we describe the three flavors of APIs included in LBT.

experiment API

This API provides an alternative method for running experiments. Note that running experiments via the API still requires populating the aforementioned configuration files.

from lbt.experiments import experiment

experiment(
    models = ['rnn', 'bert'],
    datasets = ['agnews'],
    run_environment = "local",
    elastic_search_config = None,
    resume_existing_exp = False,
)

tools API

This API provides access to two tooling integrations: TextAttack and Robustness Gym (RG). The TextAttack API can be used to generate adversarial attacks and to augment data files. The RG API lets users inspect model performance on a set of generic, pre-built slices and add more slices for their specific datasets and use cases.

from lbt.tools.robustnessgym import RG 
from lbt.tools.textattack import attack, augment

# Robustness Gym API Usage
RG(
    dataset_name="AGNews",
    models=["bert", "rnn"],
    path_to_dataset="agnews.csv",
    subpopulations=["entities", "positive_words", "negative_words"],
)

# TextAttack API Usage
attack(
    dataset_name="AGNews",
    path_to_model="agnews/model/rnn_model",
    path_to_dataset="agnews.csv",
    attack_recipe=["CharSwapAugmenter"],
)

augment(
    dataset_name="AGNews",
    transformations_per_example=1,
    path_to_dataset="agnews.csv",
    augmenter=["WordNetAugmenter"],
)

visualizations API

This API provides out-of-the-box visualizations of learning behavior, model performance, and hyperparameter optimization, using the training and evaluation statistics generated during model training.

from lbt.visualizations import (
    compare_performance_viz,
    hyperopt_viz,
    learning_curves_viz,
)

# compare model performance
compare_performance_viz(
    dataset_name="toy_agnews",
    model_name="rnn",
    output_feature_name="class_index",
)

# compare training and validation trajectory
learning_curves_viz(
    dataset_name="toy_agnews",
    model_name="rnn",
    output_feature_name="class_index",
)

# visualize hyperparameter optimization search
hyperopt_viz(
    dataset_name="toy_agnews",
    model_name="rnn",
    output_dir="."
)

EXPERIMENT EXTENSIBILITY

Adding new custom datasets

Adding a custom dataset requires creating a new LBTDataset class and adding it to the dataset registry. Creating an LBTDataset object requires implementing three class methods: download, process and load. Please see the ToyAGNews dataset as an example.
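
The shape of such a class can be sketched as follows. This is a self-contained illustration only: LBTDatasetStandIn, DATASET_REGISTRY, and register_dataset are stand-ins written for this example, and the real LBTDataset base class and registry in LBT may have different names, signatures, and registration mechanics. The three methods are written as ordinary instance methods here, which is also an assumption.

```python
import csv
import io
from abc import ABC, abstractmethod

# Stand-in registry and base class; LBT's real ones may differ.
DATASET_REGISTRY = {}


class LBTDatasetStandIn(ABC):
    @abstractmethod
    def download(self): ...

    @abstractmethod
    def process(self): ...

    @abstractmethod
    def load(self): ...


def register_dataset(name):
    """Decorator that stores the class in the registry under the given name."""
    def decorator(cls):
        DATASET_REGISTRY[name] = cls
        return cls
    return decorator


@register_dataset("tiny_news")
class TinyNews(LBTDatasetStandIn):
    """Hypothetical in-memory dataset used only to show the three methods."""

    RAW = "class_index,title\n1,Markets rally\n2,Team wins final\n"

    def download(self):
        # A real dataset would fetch files here; this one uses an inline string.
        self.raw = self.RAW

    def process(self):
        # Parse the raw CSV text into a list of row dictionaries.
        self.rows = list(csv.DictReader(io.StringIO(self.raw)))

    def load(self):
        # Run the full pipeline and hand back the processed rows.
        self.download()
        self.process()
        return self.rows


rows = DATASET_REGISTRY["tiny_news"]().load()
print(rows[0]["title"])
```

Looking the class up by name in the registry mirrors how a dataset name passed on the command line could be resolved to an implementation.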

Adding new metrics

Adding custom evaluation metrics requires creating a new LBTMetric class and adding it to the metrics registry. Creating an LBTMetric object requires implementing the run class method which takes as potential inputs a path to a model directory, path to a dataset, training batch size, and training statistics. Please see the pre-built LBT metrics for examples.
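
A metric can be sketched in the same spirit. The code below is an illustration under stated assumptions: LBTMetricStandIn, METRIC_REGISTRY, and register_metric are stand-ins written for this example, the argument names of run are guesses based on the inputs listed above, and the layout of the training statistics (a dictionary with a "train_loss" list) is hypothetical.

```python
from abc import ABC, abstractmethod

# Stand-in registry and base class; LBT's real ones may differ.
METRIC_REGISTRY = {}


class LBTMetricStandIn(ABC):
    @classmethod
    @abstractmethod
    def run(cls, model_path=None, dataset_path=None,
            train_batch_size=None, run_stats=None):
        ...


def register_metric(name):
    """Decorator that stores the class in the registry under the given name."""
    def decorator(cls):
        METRIC_REGISTRY[name] = cls
        return cls
    return decorator


@register_metric("epochs_trained")
class EpochsTrained(LBTMetricStandIn):
    """Hypothetical metric: number of epochs recorded in the training statistics."""

    @classmethod
    def run(cls, model_path=None, dataset_path=None,
            train_batch_size=None, run_stats=None):
        # Count entries in the (assumed) per-epoch list of training losses.
        losses = (run_stats or {}).get("train_loss", [])
        return len(losses)


value = METRIC_REGISTRY["epochs_trained"].run(
    run_stats={"train_loss": [0.9, 0.6, 0.4]}
)
print(value)
```

Because every input is optional, a metric that only needs the training statistics can ignore the model and dataset paths, which matches the "potential inputs" wording above.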

ELASTICSEARCH RESEARCH DATABASE

To get credentials to upload experiments to the shared Elasticsearch research database, please fill out this form.

Owner
HazyResearch
We are a CS research group led by Prof. Chris Ré.