Jupyter Notebook extension leveraging pandas DataFrames by integrating DataTables and ChartJS.

Overview

Jupyter DataTables

Jupyter Notebook extension to leverage pandas DataFrames by integrating DataTables JS.


About

Data scientists, and in fact many developers, work with pd.DataFrame on a daily basis to interpret and process data. The common workflow is to display the DataFrame, look at the data schema, and then produce multiple plots to check the distribution of the data and get a clearer picture, perhaps searching for some data in the table along the way.

What if those distribution plots were part of the standard DataFrame and we had the ability to quickly search through the table with minimal effort? What if it was the default representation?

jupyter-datatables uses jupyter-require to draw the table.


Installation

pip install jupyter-datatables

Usage

import string

import numpy as np
import pandas as pd

from jupyter_datatables import init_datatables_mode

init_datatables_mode()

That's it, your default pandas representation will now use Jupyter DataTables!

df = pd.DataFrame(np.abs(np.random.randn(50, 5)), columns=list(string.ascii_uppercase[:5]))

Jupyter Datatables table representation


In most cases, you don't need to worry about the size of your data. Jupyter DataTables calculates the required sample size from a confidence interval (0.95 by default) and a margin of error, then rounds it up to the next 'smart' value.

For example, for data containing 100,000 samples, a 0.975 confidence interval and a 0.02 margin of error, Jupyter DataTables calculates that 3,044 samples are required and rounds that up to 4,000.

Jupyter Datatables long table sample size

With additional note:

Sample size: 4,000 out of 100,000
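
The numbers in the example above can be reproduced with a short sketch. This is not taken from the jupyter-datatables source: it assumes Cochran's sample size formula with a finite population correction, which happens to yield the 3,044 figure, and it assumes the 'smart' value means rounding up to one significant digit. The function names `required_sample_size` and `ceil_to_leading_digit` are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(population, margin_of_error, confidence=0.95, p=0.5):
    """Cochran's formula with finite population correction (assumed, not the library's code)."""
    # Two-sided z-score for the requested confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population; p = 0.5 is the worst case.
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    # Shrink the estimate for a finite population.
    return n0 / (1 + (n0 - 1) / population)


def ceil_to_leading_digit(n):
    """Round up so that only the leading digit is non-zero (hypothetical 'smart' rule)."""
    magnitude = 10 ** math.floor(math.log10(n))
    return math.ceil(n / magnitude) * magnitude


population = 100_000
required = required_sample_size(population, margin_of_error=0.02, confidence=0.975)
sample_size = ceil_to_leading_digit(required)

print(f"Required: about {int(required):,}")                     # Required: about 3,044
print(f"Sample size: {sample_size:,} out of {population:,}")    # Sample size: 4,000 out of 100,000
```

Under these assumptions the required sample grows slowly with the population, which is why sampling keeps long tables responsive.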


We can also handle wide tables with ease.

df = pd.DataFrame(np.abs(np.random.randn(50, 20)), columns=list(string.ascii_uppercase[:20]))

Jupyter Datatables wide table representation


As of 0.3.0, interactive tooltips are supported:

Jupyter Datatables wide table representation

Custom indices, including date types, are also supported:

dft = pd.DataFrame({'A': np.random.rand(5),
                    'B': [1, 1, 3, 2, 1],
                    'C': 'This is a very long sentence that should automatically be trimmed',
                    'D': [pd.Timestamp('20010101'), pd.Timestamp('20010102'), pd.Timestamp('20010103'), pd.Timestamp('20010104'), pd.Timestamp('20010105')],
                    'E': pd.Series([1.0] * 5).astype('float32'),
                    'F': [False, True, False, False, True],
                   })

dft.D = dft.D.apply(pd.to_datetime)
dft.set_index('D', inplace=True)

Jupyter Datatables wide table representation



Current status and future plans:

Check out the Project Board where we track issues and TODOs for our Jupyter tooling!


Author: Marek Cermak, @AICoE
