A workshop on data visualization in Python with notebooks and exercises for following along.

Overview

Beyond the Basics: Data Visualization in Python

Binder Nbviewer View slides in browser

The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically-pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python.

While there are many plotting libraries to choose from, the prolific Matplotlib library is always a great place to start. Since various Python data science libraries utilize Matplotlib under the hood, familiarity with Matplotlib itself gives you the flexibility to fine tune the resulting visualizations (e.g., add annotations, animate, etc.). This session will also introduce interactive visualizations using HoloViz, which provides a higher-level plotting API capable of using Matplotlib and Bokeh (a Python library for generating interactive, JavaScript-powered visualizations) under the hood.

Workshop Outline

This is a workshop on data visualization in Python first delivered at ODSC West 2021 and subsequently at ODSC East 2022 and PyCon Italia 2022. It's divided into the following sections:

Section 1: Getting Started With Matplotlib

We will begin by familiarizing ourselves with Matplotlib. Moving beyond the default options, we will explore how to customize various aspects of our visualizations. By the end of this section, you will be able to generate plots using the Matplotlib API directly, as well as customize the plots that libraries like pandas and Seaborn create for you.

Section 2: Moving Beyond Static Visualizations

Static visualizations are limited in how much information they can show. To move beyond these limitations, we can create animated and/or interactive visualizations. Animations make it possible for our visualizations to tell a story through movement of the plot components (e.g., bars, points, lines). Interactivity makes it possible to explore the data visually by hiding and displaying information based on user interest. In this section, we will focus on creating animated visualizations using Matplotlib before moving on to create interactive visualizations in the next section.

Section 3: Building Interactive Visualizations for Data Exploration

When exploring our data, interactive visualizations can provide the most value. Without having to create multiple iterations of the same plot, we can use mouse actions (e.g., click, hover, zoom, etc.) to explore different aspects and subsets of the data. In this section, we will learn how to use a few of the libraries in the HoloViz ecosystem to create interactive visualizations for exploring our data utilizing the Bokeh backend.


Prerequisites

You should have basic knowledge of Python and be comfortable working in Jupyter Notebooks. Check out this notebook for a crash course in Python or work through the official Python tutorial for a more formal introduction. The environment we will use for this workshop comes with JupyterLab, which is pretty intuitive, but be sure to familiarize yourself using notebooks in JupyterLab and additional functionality in JupyterLab. In addition, a basic understanding of pandas will be beneficial, but is not required; reviewing the first section of my pandas workshop will be sufficient.


Setup Instructions

  1. Install Anaconda/Miniconda. Note that you can use this Binder environment instead if you don't want to install anything on your machine.

  2. Fork this repository:

    location of fork button in GitHub

  3. Clone your forked repository:

    location of clone button in GitHub

  4. Create and activate a conda virtual environment (on Windows, these commands should be run in Anaconda Prompt):

    $ cd python-data-viz-workshop
    ~/python-data-viz-workshop$ conda install mamba -n base -c conda-forge
    ~/python-data-viz-workshop$ mamba env create --file environment.yml
    ~/python-data-viz-workshop$ conda activate data_viz_workshop
    (data_viz_workshop) ~/python-data-viz-workshop$
  5. Launch JupyterLab:

    (data_viz_workshop) ~/python-data-viz-workshop$ jupyter lab
  6. Navigate to the 0-check_your_env.ipynb notebook in the notebooks/ folder:

    open 0-check_your_env.ipynb

  7. Run the notebook to confirm everything is set up properly:

    check env


About the Author

Stefanie Molin (@stefmolin) is a software engineer and data scientist at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also the author of Hands-On Data Analysis with Pandas, which is currently in its second edition. She holds a bachelor’s of science degree in operations research from Columbia University's Fu Foundation School of Engineering and Applied Science. She is currently pursuing a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.

Related Content

All examples herein were developed exclusively for this workshop. Hands-On Data Analysis with Pandas contains additional examples and exercises, as does this blog post and this workshop on pandas.

Owner
Stefanie Molin
Developer | Data Scientist | Author of "Hands-On Data Analysis with Pandas" | occasional hacker
Stefanie Molin
A curated list of awesome Dash (plotly) resources

Awesome Dash A curated list of awesome Dash (plotly) resources Dash is a productive Python framework for building web applications. Written on top of

Luke Singham 1.7k Jan 07, 2023
哔咔漫画window客户端,界面使用PySide2,已实现分类、搜索、收藏夹、下载、在线观看、waifu2x等功能。

picacomic-windows 哔咔漫画window客户端,界面使用PySide2,已实现分类、搜索、收藏夹、下载、在线观看等功能。 功能介绍 登陆分流,还原安卓端的三个分流入口 分类,搜索,排行,收藏夹使用同一的逻辑,滚轮下滑自动加载下一页,双击打开 漫画详情,章节列表和评论列表 下载功能,目

1.8k Dec 31, 2022
Matplotlib tutorial for beginner

matplotlib is probably the single most used Python package for 2D-graphics. It provides both a very quick way to visualize data from Python and publication-quality figures in many formats. We are goi

Nicolas P. Rougier 2.6k Dec 28, 2022
Getting started with Python, Dash and Plot.ly for the Data Dashboards team

data_dashboards Getting started with Python, Dash and Plot.ly for the Data Dashboards team Getting started MacOS users: # Install the pyenv version ma

Department for Levelling Up, Housing and Communities 1 Nov 08, 2021
Realtime Viewer Mandelbrot set with Python and Taichi (cpu, opengl, cuda, vulkan, metal)

Mandelbrot-set-Realtime-Viewer- Realtime Viewer Mandelbrot set with Python and Taichi (cpu, opengl, cuda, vulkan, metal) Control: "WASD" - movement, "

22 Oct 31, 2022
Python Data. Leaflet.js Maps.

folium Python Data, Leaflet.js Maps folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js

6k Jan 02, 2023
WhatsApp Chat Analyzer is a WebApp and it can be used by anyone to analyze their chat. 😄

WhatsApp-Chat-Analyzer You can view the working project here. WhatsApp chat Analyzer is a WebApp where anyone either tech or non-tech person can analy

Prem Chandra Singh 26 Nov 02, 2022
Define fortify and autoplot functions to allow ggplot2 to handle some popular R packages.

ggfortify This package offers fortify and autoplot functions to allow automatic ggplot2 to visualize statistical result of popular R packages. Check o

Sinhrks 504 Dec 23, 2022
Make your BSC transaction simple.

bsc_trade_history Make your BSC transaction simple. 中文ReadMe Background: inspired by debank ,Practice my hands on this small project Blog:Crypto-BscTr

foolisheddy 7 Jul 06, 2022
MPL Plotter is a Matplotlib based Python plotting library built with the goal of delivering publication-quality plots concisely.

MPL Plotter is a Matplotlib based Python plotting library built with the goal of delivering publication-quality plots concisely.

Antonio López Rivera 162 Nov 11, 2022
Color maps for POV-Ray v3.7 from the Plasma, Inferno, Magma and Viridis color maps in Python's Matplotlib

POV-Ray-color-maps Color maps for POV-Ray v3.7 from the Plasma, Inferno, Magma and Viridis color maps in Python's Matplotlib. The include file Color_M

Tor Olav Kristensen 1 Apr 05, 2022
This is my favourite function - the Rastrigin function.

This is my favourite function - the Rastrigin function. What sparked my curiosity and interest in the function was its complexity in terms of many local optimum points, which makes it particularly in

1 Dec 27, 2021
A python script editor for napari based on PyQode.

napari-script-editor A python script editor for napari based on PyQode. This napari plugin was generated with Cookiecutter using with @napari's cookie

Robert Haase 9 Sep 20, 2022
Movie recommendation using RASA, TigerGraph

Demo run: The below video will highlight the runtime of this setup and some sample real-time conversations using the power of RASA + TigerGraph, Steps

Sudha Vijayakumar 3 Sep 10, 2022
This is a small repository for me to implement my simply Data Visualisation skills through Python.

Data Visualisations This is a small repository for me to implement my simply Data Visualisation skills through Python. Steam Population Chart from 10/

9 Dec 31, 2021
a python function to plot a geopandas dataframe

Pretty GeoDataFrame A minimum python function (~60 lines) to draw pretty geodataframe. Based on matplotlib, shapely, descartes. Installation just use

haoming 27 Dec 05, 2022
Gesture controlled media player

Media Player Gesture Control Gesture controller for media player with MediaPipe, VLC and OpenCV. Contents About Setup About A tool for using gestures

Atharva Joshi 2 Dec 22, 2021
Some useful extensions for Matplotlib.

mplx Some useful extensions for Matplotlib. Contour plots for functions with discontinuities plt.contour mplx.contour(max_jump=1.0) Matplotlib has pro

Nico Schlömer 519 Dec 30, 2022
Visualizations of linear algebra algorithms for people who want a deep understanding

Visualising algorithms on symmetric matrices Examples QR algorithm and LR algorithm Here, we have a GIF animation of an interactive visualisation of t

ogogmad 3 May 05, 2022
paintable GitHub contribute table

githeart paintable github contribute table how to use: Functions key color select 1,2,3,4,5 clear c drawing mode mode on turn off e print paint matrix

Bahadır Araz 27 Nov 24, 2022