Asterisk*

Generating Training Data Made Easy

Asterisk is a framework for generating high-quality training datasets at scale. Instead of relying on end users to write heuristics by hand, it uses a small set of labeled data to automatically produce a set of heuristics that assign initial labels. To improve the quality of the generated labels, the framework refines the accuracies of the heuristics with a novel data-driven active learning (AL) process. During this process, the system examines the generated weak labels together with the modeled accuracies of the heuristics to help the learner decide which points the user should label.
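The workflow described above can be illustrated with a small, self-contained sketch. It does not use the Asterisk API, and every name in it (`generate_heuristics`, `estimate_accuracies`, `weak_label_probability`, `select_queries`) is hypothetical. It assumes binary labels in {+1, -1}, one threshold heuristic per feature, accuracies estimated directly on the labeled set, and an accuracy-weighted vote. The framework's own heuristic generator and accuracy model may work quite differently.

```python
import math

def generate_heuristics(labeled):
    """Build one threshold heuristic per feature from a small labeled set.

    Assumption: both classes appear in `labeled`. The threshold is the
    midpoint between the two class means, an illustrative choice only.
    """
    n_features = len(labeled[0][0])
    heuristics = []
    for f in range(n_features):
        pos = [x[f] for x, y in labeled if y == 1]
        neg = [x[f] for x, y in labeled if y == -1]
        pos_mean = sum(pos) / len(pos)
        neg_mean = sum(neg) / len(neg)
        threshold = (pos_mean + neg_mean) / 2
        polarity = 1 if pos_mean >= neg_mean else -1
        # Bind loop variables as defaults so each heuristic keeps its own values.
        heuristics.append(
            lambda x, f=f, t=threshold, p=polarity: p if x[f] > t else -p
        )
    return heuristics

def estimate_accuracies(heuristics, labeled):
    """Fraction of labeled points each heuristic gets right, clipped to
    [0.05, 0.95] so that the log-odds weights below stay finite."""
    accuracies = []
    for h in heuristics:
        correct = sum(1 for x, y in labeled if h(x) == y)
        accuracies.append(min(max(correct / len(labeled), 0.05), 0.95))
    return accuracies

def weak_label_probability(x, heuristics, accuracies):
    """Accuracy-weighted vote, returned as an estimate of P(label = +1)."""
    score = sum(math.log(a / (1 - a)) * h(x)
                for h, a in zip(heuristics, accuracies))
    return 1 / (1 + math.exp(-score))

def select_queries(unlabeled, heuristics, accuracies, k):
    """Indices of the k points whose weak label is least certain."""
    def uncertainty(i):
        p = weak_label_probability(unlabeled[i], heuristics, accuracies)
        return abs(p - 0.5)
    return sorted(range(len(unlabeled)), key=uncertainty)[:k]

# Toy data: (features, label) pairs and a few unlabeled points.
LABELED = [([1.0, 5.0], 1), ([0.9, 4.0], 1), ([0.1, 1.0], -1), ([0.2, 4.2], -1)]
UNLABELED = [[0.8, 5.0], [0.7, 1.0], [0.1, 0.5], [0.3, 4.5]]

heuristics = generate_heuristics(LABELED)
accuracies = estimate_accuracies(heuristics, LABELED)
to_label = select_queries(UNLABELED, heuristics, accuracies, k=2)
```

In this toy data the two heuristics disagree on the second and fourth unlabeled points, so those are the ones the sketch would hand to the user for true labels. After the user answers, the new labels would be added to the labeled set and the accuracies re-estimated, which is the loop the paragraph above describes.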

Installation

To install Asterisk, you can use pip:

pip install asterisk

or clone the Git repository and run the following from its root directory:

pip install -e .

Publications

M. Nashaat, A. Ghosh, J. Miller, and S. Quader, "Asterisk: Generating Large Training Datasets with Automatic Active Supervision," ACM Transactions on Data Science (TDS), May 2020.
M. Nashaat, A. Ghosh, J. Miller, and S. Quader, "WeSAL: Applying Active Supervision to Find High-quality Labels at Industrial Scale," Proceedings of the 53rd Hawaii International Conference on System Sciences, HI, USA, 2020, pp. 219-228.
M. Nashaat, A. Ghosh, J. Miller, S. Quader, C. Marston, and J. Puget, "Hybridization of Active Learning and Data Programming for Labeling Large Industrial Datasets," 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 2018, pp. 46-55. doi: 10.1109/BigData.2018.8622459.

Owner

Mona Nashaat
