Label Studio is a multi-type data labeling and annotation tool with standardized output format

Overview

GitHub label-studio:build GitHub release

WebsiteDocsTwitterJoin Slack Community

What is Label Studio?

Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can be used to prepare raw data or improve existing training data to get more accurate ML models.

Gif of Label Studio annotating different types of data

Have a custom dataset? You can customize Label Studio to fit your needs. Read an introductory blog post to learn more.

Try out Label Studio

Install Label Studio locally, or deploy it in a cloud instance. Also you can try Label Studio Teams.

Install locally with Docker

Official Label Studio docker image is here and it can be downloaded with docker pull. Run Label Studio in a Docker container and access it at http://localhost:8080.

docker pull heartexlabs/label-studio:latest
docker run -it -p 8080:8080 -v `pwd`/mydata:/label-studio/data heartexlabs/label-studio:latest

You can find all the generated assets, including SQLite3 database storage label_studio.sqlite3 and uploaded files, in the ./mydata directory.

Override default Docker install

You can override the default launch command by appending the new arguments:

docker run -it -p 8080:8080 -v `pwd`/mydata:/label-studio/data heartexlabs/label-studio:latest label-studio --log-level DEBUG

Build a local image with Docker

If you want to build a local image, run:

docker build -t heartexlabs/label-studio:latest .

Run with Docker Compose

Docker compose script provides production-ready stack consisting of the following components:

  • Label Studio
  • Nginx - proxy web server used to load various static data, including uploaded audio, images, etc.
  • PostgreSQL - production-ready database that replaces less performant SQLite3.

To start using the app from http://localhost run this command:

docker-compose up

Install locally with pip

# Requires >=Python3.6, <3.9
pip install label-studio

# Start the server at http://localhost:8080
label-studio

Install locally with Anaconda

conda create --name label-studio python=3.8
conda activate label-studio
pip install label-studio

Install for local development

You can run the latest Label Studio version locally without installing the package with pip.

# Install all package dependencies
pip install -e .
# Run database migrations
python label_studio/manage.py migrate
# Start the server in development mode at http://localhost:8080
python label_studio/manage.py runserver

Deploy in a cloud instance

You can deploy Label Studio with one click in Heroku, Microsoft Azure, or Google Cloud Platform:

Apply frontend changes

The frontend part of Label Studio app lies in the frontend/ folder and written in React JSX. In case you've made some changes there, the following commands should be run before building / starting the instance:

cd label_studio/frontend/
npm ci
npx webpack
cd ../..
python label_studio/manage.py collectstatic --no-input

Troubleshoot installation

If you see any errors during installation, try to rerun the installation

pip install --ignore-installed label-studio

Install dependencies on Windows

To run Label Studio on Windows, download and install the following wheel packages from Gohlke builds to ensure you're using the correct version of Python:

# Upgrade pip 
pip install -U pip

# If you're running Win64 with Python 3.8, install the packages downloaded from Gohlke:
pip install lxml‑4.5.0‑cp38‑cp38‑win_amd64.whl

# Install label studio
pip install label-studio

What you get from Label Studio

Screenshot of Label Studio data manager grid view with images

  • Multi-user labeling sign up and login, when you create an annotation it's tied to your account.
  • Multiple projects to work on all your datasets in one instance.
  • Streamlined design helps you focus on your task, not how to use the software.
  • Configurable label formats let you customize the visual interface to meet your specific labeling needs.
  • Support for multiple data types including images, audio, text, HTML, time-series, and video.
  • Import from files or from cloud storage in Amazon AWS S3, Google Cloud Storage, or JSON, CSV, TSV, RAR, and ZIP archives.
  • Integration with machine learning models so that you can visualize and compare predictions from different models and perform pre-labeling.
  • Embed it in your data pipeline REST API makes it easy to make it a part of your pipeline

Included templates for labeling data in Label Studio

Label Studio includes a variety of templates to help you label your data, or you can create your own using specifically designed configuration language. The most common templates and use cases for labeling include the following cases:

Set up machine learning models with Label Studio

Connect your favorite machine learning model using the Label Studio Machine Learning SDK. Follow these steps:

  1. Start your own machine learning backend server. See more detailed instructions.
  2. Connect Label Studio to the server on the model page found in project settings.

This lets you:

  • Pre-label your data using model predictions.
  • Do online learning and retrain your model while new annotations are being created.
  • Do active learning by labeling only the most complex examples in your data.

Integrate Label Studio with your existing tools

You can use Label Studio as an independent part of your machine learning workflow or integrate the frontend or backend into your existing tools.

Ecosystem

Project Description
label-studio Server, distributed as a pip package
label-studio-frontend React and JavaScript frontend and can run standalone in a web browser or be embedded into your application.
data-manager React and JavaScript frontend for managing data. Includes the Label Studio Frontend. Relies on the label-studio server or a custom backend with the expected API methods.
label-studio-converter Encode labels in the format of your favorite machine learning library
label-studio-transformers Transformers library connected and configured for use with Label Studio

Roadmap

Want to use The Coolest Feature X but Label Studio doesn't support it? Check out our public roadmap!

Citation

@misc{Label Studio,
  title={{Label Studio}: Data labeling software},
  url={https://github.com/heartexlabs/label-studio},
  note={Open source software available from https://github.com/heartexlabs/label-studio},
  author={
    Maxim Tkachenko and
    Mikhail Malyuk and
    Nikita Shevchenko and
    Andrey Holmanyuk and
    Nikolai Liubimov},
  year={2020-2021},
}

License

This software is licensed under the Apache 2.0 LICENSE © Heartex. 2020-2021

Owner
Heartex
Data labeling and exploration tools for Machine Learning
Heartex
Wandb-predictions - WANDB Predictions With Python

WANDB API CI/CD Below we capture the CI/CD scenarios that we would expect with o

Anish Shah 6 Oct 07, 2022
DiffStride: Learning strides in convolutional neural networks

DiffStride is a pooling layer with learnable strides. Unlike strided convolutions, average pooling or max-pooling that require cross-validating stride values at each layer, DiffStride can be initiali

Google Research 113 Dec 13, 2022
Estimating Example Difficulty using Variance of Gradients

Estimating Example Difficulty using Variance of Gradients This repository contains source code necessary to reproduce some of the main results in the

Chirag Agarwal 48 Dec 26, 2022
Repository of Vision Transformer with Deformable Attention

Vision Transformer with Deformable Attention This repository contains the code for the paper Vision Transformer with Deformable Attention [arXiv]. Int

410 Jan 03, 2023
This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation".

[CVPRW 2021] - Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation

Anirudh S Chakravarthy 6 May 03, 2022
Deep generative models of 3D grids for structure-based drug discovery

What is liGAN? liGAN is a research codebase for training and evaluating deep generative models for de novo drug design based on 3D atomic density grid

Matt Ragoza 152 Jan 03, 2023
RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.

RITA: a Study on Scaling Up Generative Protein Sequence Models RITA is a family of autoregressive protein models, developed by a collaboration of Ligh

LightOn 69 Dec 22, 2022
OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis

OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis Overview OpenABC-D is a large-scale labeled dataset generate

NYU Machine-Learning guided Design Automation (MLDA) 31 Nov 22, 2022
code for "Feature Importance-aware Transferable Adversarial Attacks"

Feature Importance-aware Attack(FIA) This repository contains the code for the paper: Feature Importance-aware Transferable Adversarial Attacks (ICCV

Hengchang Guo 44 Nov 24, 2022
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.

Object Pose Estimation Demo This tutorial will go through the steps necessary to perform pose estimation with a UR3 robotic arm in Unity. You’ll gain

Unity Technologies 187 Dec 24, 2022
Machine Learning Toolkit for Kubernetes

Kubeflow the cloud-native platform for machine learning operations - pipelines, training and deployment. Documentation Please refer to the official do

Kubeflow 12.1k Jan 03, 2023
Trained on Simulated Data, Tested in the Real World

Trained on Simulated Data, Tested in the Real World

livox 43 Nov 18, 2022
ZSL-KG is a general-purpose zero-shot learning framework with a novel transformer graph convolutional network (TrGCN) to learn class representation from common sense knowledge graphs.

ZSL-KG is a general-purpose zero-shot learning framework with a novel transformer graph convolutional network (TrGCN) to learn class representa

Bats Research 94 Nov 21, 2022
ImageNet-CoG is a benchmark for concept generalization. It provides a full evaluation framework for pre-trained visual representations which measure how well they generalize to unseen concepts.

The ImageNet-CoG Benchmark Project Website Paper (arXiv) Code repository for the ImageNet-CoG Benchmark introduced in the paper "Concept Generalizatio

NAVER 23 Oct 09, 2022
Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study.

APR The repo for the paper Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study. Environment setu

ielab 8 Nov 26, 2022
Official pytorch implementation of "Feature Stylization and Domain-aware Contrastive Loss for Domain Generalization" ACMMM 2021 (Oral)

Feature Stylization and Domain-aware Contrastive Loss for Domain Generalization This is an official implementation of "Feature Stylization and Domain-

22 Sep 22, 2022
Vector Quantized Diffusion Model for Text-to-Image Synthesis

Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov

Shuyang Gu 294 Jan 05, 2023
You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors

You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors In this paper, we propose a novel local descriptor-based fra

Haiping Wang 80 Dec 15, 2022
Intro-to-dl - Resources for "Introduction to Deep Learning" course.

Introduction to Deep Learning course resources https://www.coursera.org/learn/intro-to-deep-learning Running on Google Colab (tested for all weeks) Go

Advanced Machine Learning specialisation by HSE 761 Dec 24, 2022
Python script that allows you to automatically setup your Growtopia server.

AutoSetup Python script that allows you to automatically setup your Growtopia server. How To Use Firstly, install all the required modules that used i

Aspire 3 Mar 06, 2022