Focus on Algorithm Design, Not on Data Wrangling

Overview

The visual data management platform from Zensors.


Join for free at app.datatap.dev.

The dataTap Python library is the primary interface for using dataTap's rich data management tools. Create datasets, stream annotations, and analyze model performance all with one library.


Documentation

Full documentation is available at docs.datatap.dev.

Features

  • Begin training instantly
  • 🔥 Works with all major ML frameworks (Pytorch, TensorFlow, etc.)
  • 🛰️ Real-time streaming to avoid large dataset downloads
  • 🌐 Universal data format for simple data exchange
  • 🎨 Combine data from multiples sources into a single dataset easily
  • 🧮 Rich ML utilities to compute PR-curves, confusion matrices, and accuracy metrics.
  • 💽 Free access to a variety of open datasets.

Getting Started (Platform)

To begin, select a dataset from the dataTap repository.

Then copy the starter code based on your library preference.

Paste the starter code and start training.

Getting Started (API)

Install the client library.

pip install datatap

Register at app.datatap.dev. Then, go to Settings > Api Keys to find your personal API key.

export DATATAP_API_KEY="XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXX"

Start using open datasets instantly.

from datatap import Api

api = Api()
coco = api.get_default_database().get_repository("_/coco")
dataset = coco.get_dataset("latest")
print("COCO: ", dataset)

Data Streaming Example

import itertools
from datatap import Api

api = Api()
dataset = (api
    .get_default_database()
    .get_repository("_/wider-person")
    .get_dataset("latest")
)

training_stream = dataset_version.stream_split("training")
for annotation in itertools.islice(training_stream, 5):
    print("Received annotation:", annotation)

More Examples

Support and FAQ

Q. How do I resolve a missing API Key?

If you see the error Exception: No API key available. Either provide it or use the [DATATAP_API_KEY] environment variable, then the dataTap library was not able to find your API key. You can find your API key on app.datatap.dev under settings. You can either set it as an environment variable or as the first argument to the Api constructor.

Q. Can dataTap be used offline?

Some functionality can be used offline, such as the droplet utilities and metrics. However, repository access and dataset streaming require internet access, even for local databases.

Q. Is dataTap accepting contributions?

dataTap currently uses a separate code review system for managing contributions. The team is looking into switching that system to GitHub to allow public contributions. Until then, we will actively monitor the GitHub issue tracker to help accomodate the community's needs.

Q. How can I get help using dataTap?

You can post a question in the issue tracker. The dataTap team actively monitors the repository, and will try to get back to you as soon as possible.

Handout for the tutorial "Creating publication-quality figures with matplotlib"

Handout for the tutorial "Creating publication-quality figures with matplotlib"

JB Mouret 1.9k Jan 02, 2023
Cartopy - a cartographic python library with matplotlib support

Cartopy is a Python package designed to make drawing maps for data analysis and visualisation easy. Table of contents Overview Get in touch License an

1.2k Jan 01, 2023
Splore - a simple graphical interface for scrolling through and exploring data sets of molecules

Scroll through and exPLORE molecule sets The splore framework aims to offer a si

3 Jun 18, 2022
Scientific measurement library for instruments, experiments, and live-plotting

PyMeasure scientific package PyMeasure makes scientific measurements easy to set up and run. The package contains a repository of instrument classes a

PyMeasure 445 Jan 04, 2023
a python function to plot a geopandas dataframe

Pretty GeoDataFrame A minimum python function (~60 lines) to draw pretty geodataframe. Based on matplotlib, shapely, descartes. Installation just use

haoming 27 Dec 05, 2022
Political elections, appointment, analysis and visualization in Python

Political elections, appointment, analysis and visualization in Python poli-sci-kit is a Python package for political science appointment and election

Andrew Tavis McAllister 9 Dec 01, 2022
nvitop, an interactive NVIDIA-GPU process viewer, the one-stop solution for GPU process management

An interactive NVIDIA-GPU process viewer, the one-stop solution for GPU process management.

Xuehai Pan 1.3k Jan 02, 2023
Eulera Dashboard is an easy and intuitive way to get a quick feel of what’s happening on the world’s market.

an easy and intuitive way to get a quick feel of what’s happening on the world’s market ! Eulera dashboard is a tool allows you to monitor historical

Salah Eddine LABIAD 4 Nov 25, 2022
Active Transport Analytics Model (ATAM) is a new strategic transport modelling and data visualization framework for Active Transport as well as emerging micro-mobility modes

{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”) is a new strategic transport modelling and data visualization framew

Peter Stephan 0 Jan 12, 2022
Data science project for exploratory analysis on the kcse grades dataset (Kamilimu Data Science Track)

Kcse-Data-Analysis Data science project for exploratory analysis on the kcse grades dataset (Kamilimu Data Science Track) Findings The performance of

MUGO BRIAN 1 Feb 23, 2022
Gaphas is the diagramming widget library for Python.

Gaphas Gaphas is the diagramming widget library for Python. Gaphas is a library that provides the user interface component (widget) for drawing diagra

Gaphor 144 Dec 14, 2022
DataVisualization - The evolution of my arduino and python journey. New level of competence achieved

DataVisualization - The evolution of my arduino and python journey. New level of competence achieved

1 Jan 03, 2022
Bioinformatics tool for exploring RNA-Protein interactions

Explore RNA-Protein interactions. RNPFind is a bioinformatics tool. It takes an RNA transcript as input and gives a list of RNA binding protein (RBP)

Nahin Khan 3 Jan 27, 2022
Simple Python interface for Graphviz

Simple Python interface for Graphviz

Sebastian Bank 1.3k Dec 26, 2022
A simple, fast, extensible python library for data validation.

Validr A simple, fast, extensible python library for data validation. Simple and readable schema 10X faster than jsonschema, 40X faster than schematic

kk 209 Sep 19, 2022
Show Data: Show your dataset in web browser!

Show Data is to generate html tables for large scale image dataset, especially for the dataset in remote server. It provides some useful commond line tools and fully customizeble API reference to gen

Dechao Meng 83 Nov 26, 2022
Generate knowledge graphs with interesting geometries, like lattices

Geometric Graphs Generate knowledge graphs with interesting geometries, like lattices. Works on Python 3.9+ because it uses cool new features. Get out

Charles Tapley Hoyt 5 Jan 03, 2022
Simple and lightweight Spotify Overlay written in Python.

Simple Spotify Overlay This is a simple yet powerful Spotify Overlay. About I have been looking for something like this ever since I got Spotify. I th

27 Sep 03, 2022
CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

CONTRIBUTIONS ONLY What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitt

Alec Thomas 1.8k Dec 31, 2022
Dipto Chakrabarty 7 Sep 06, 2022