Python & Julia port of codes in excellent R books

Overview

X4DS

This repo is a collection of

Python & Julia port of codes in the following excellent R books:

Python Stack Julia Stack
Language
Version
v3.9 v1.7
Data
Processing
  • Pandas
  • DataFrames
  • Visualization
  • Matplotlib
  • Seaborn
  • MakiE
  • AlgebraOfGraphics
  • Machine
    Learning
  • Scikit-Learn
  • MLJ
  • Probablistic
    Programming
  • PyMC
  • Turing
  • Code Styles

    2.1. Basics

    • prefer enumerate() over range(len())
    xs = range(3)
    
    # good
    for ind, x in enumerate(xs):
      print(f'{ind}: {x}')
    
    # bad
    for i in range(len(xs)):
      print(f'{i}: {xs[i]}')

    2.2. Matplotlib

    including seaborn

    • prefer Axes object over Figure object
    • use constrained_layout=True when draw subplots
    # good
    _, axes = plt.subplots(1, 2, constrained_layout=True)
    axes[0].plot(x1, y1)
    axes[1].hist(x2, y2)
    
    # bad
    plt.subplot(121)
    plt.plot(x1, y1)
    plt.subplot(122)
    plt.hist(x2, y2)
    • prefer axes.flatten() over plt.subplot() in cases where subplots' data is iterable
    • prefer zip() or enumerate() over range() for iterable objects
    # good
    _, ax = plt.subplots(2, 2, figsize=[12,8],constrained_layout=True)
    
    for ax, x, y in zip(axes.flatten(), xs, ys):
      ax.plot(x, y)
    
    # bad
    for i in range(4):
      ax = plt.subplot(2, 2, i+1)
      ax.plot(x[i], y[i])
    • prefer set() method over set_*() method
    # good
    ax.set(xlabel='x', ylabel='y')
    
    # bad
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    • Prefer despine() over ax.spines[*].set_visible()
    # good
    sns.despine()
    
    # bad
    ax.spines["top"].set_visible(False)
    ax.spines["bottom"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.spines["left"].set_visible(False)

    2.3. Pandas

    • prefer df['col'] over df.col
    # good
    movies['duration']
    
    # bad
    movies.duration
    • prefer df.query over df[] or df.loc[] in simple-selection
    # good
    movies.query('duration >= 200')
    
    # bad
    movies[movies['duration'] >= 200]
    movies.loc[movies['duration'] >= 200, :]
    • prefer df.loc and df.iloc over df[] in multiple-selection
    # good
    movies.loc[movies['duration'] >= 200, 'genre']
    movies.iloc[0:2, :]
    
    # bad
    movies[movies['duration'] >= 200].genre
    movies[0:2]

    LaTeX Styles

    Multiple lines

    Reduce the use of begin{array}...end{array}

    • equations: begin{aligned}...end{aligned}
    $$
    \begin{aligned}
    y_1 = x^2 + 2*x \\
    y_2 = x^3 + x
    \end{aligned}
    $$
    • equations with conditions: begin{cases}...end{cases}
    $$
    \begin{cases}
    y = x^2 + 2*x & x > 0 \\
    y = x^3 + x & x ≤ 0
    \end{cases}
    $$
    • matrix: begin{matrix}...end{matrix}
    $$
    \begin{vmatrix}
      a + a^′ & b + b^′ \\ c & d
      \end{vmatrix}= \begin{vmatrix}
      a & b \\ c & d
      \end{vmatrix} + \begin{vmatrix}
      a^′ & b^′ \\ c & d
    \end{vmatrix}
    $$

    Brackets

    • prefer \Bigg...\Bigg over \left...\right
    $$
    A\Bigg[v_1\ v_2\ \ v_r\Bigg]
    $$
    • prefer \underset{}{} over \underset{}
    $$
    \underset{θ}{\mathrm{argmax}}\ p(x_i|θ)
    $$

    Expressions

    • prefer ^{\top} over ^T for transpose

    $$ 𝐀^⊤ $$

    $$
    𝐀^{\top}
    $$
    • prefer \to over \rightarrow for limit

    $$ \lim_{n → ∞} $$

    $$
    \lim_{n\to \infty}
    $$
    • prefer underset{}{} over \limits_

    $$ \underset{w}{\rm argmin}\ (wx +b) $$

    $$
    \underset{w}{\rm argmin}\ (wx +b)
    $$

    Fonts

    • prefer \mathrm over \mathop or \operatorname
    $$
    θ_{\mathrm{MLE}}=\underset{θ}{\mathrm{argmax}}\ ∑_{i = 1}^{N}\log p(x_i|θ)
    $$

    ISLR

    References

    style <style> table { border-collapse: collapse; text-align: center; } </style>
    Owner
    Gitony
    Gitony
    Visualization of hidden layer activations of small multilayer perceptrons (MLPs)

    MLP Hidden Layer Activation Visualization To gain some intuition about the internal representation of simple multi-layer perceptrons (MLPs) I trained

    Andreas Köpf 7 Dec 30, 2022
    Standardized plots and visualizations in Python

    Standardized plots and visualizations in Python pltviz is a Python package for standardized visualization. Routine and novel plotting approaches are f

    Andrew Tavis McAllister 0 Jul 09, 2022
    An intuitive library to add plotting functionality to scikit-learn objects.

    Welcome to Scikit-plot Single line functions for detailed visualizations The quickest and easiest way to go from analysis... ...to this. Scikit-plot i

    Reiichiro Nakano 2.3k Dec 31, 2022
    Python package to visualize and cluster partial dependence.

    partial_dependence A python library for plotting partial dependence patterns of machine learning classifiers. The technique is a black box approach to

    NYU Visualization Lab 25 Nov 14, 2022
    Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

    AutoViz Automatically Visualize any dataset, any size with a single line of code. AutoViz performs automatic visualization of any dataset with one lin

    AutoViz and Auto_ViML 1k Jan 02, 2023
    cqMore is a CadQuery plugin based on CadQuery 2.1.

    cqMore (under construction) cqMore is a CadQuery plugin based on CadQuery 2.1. Installation Please use conda to install CadQuery and its dependencies

    Justin Lin 36 Dec 21, 2022
    A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.

    Visdom A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Python. Overview Concepts Setup Usage API To

    FOSSASIA 9.4k Jan 07, 2023
    Python+Numpy+OpenGL: fast, scalable and beautiful scientific visualization

    Python+Numpy+OpenGL: fast, scalable and beautiful scientific visualization

    Glumpy 1.1k Jan 05, 2023
    Productivity Tools for Plotly + Pandas

    Cufflinks This library binds the power of plotly with the flexibility of pandas for easy plotting. This library is available on https://github.com/san

    Jorge Santos 2.7k Dec 30, 2022
    This GitHub Repository contains Data Analysis projects that I have completed so far! While most of th project are focused on Data Analysis, some of them are also put here to show off other skills that I have learned.

    Welcome to my Data Analysis projects page! This GitHub Repository contains Data Analysis projects that I have completed so far! While most of th proje

    Kyle Dini 1 Jan 31, 2022
    Plotting library for IPython/Jupyter notebooks

    bqplot 2-D plotting library for Project Jupyter Introduction bqplot is a 2-D visualization system for Jupyter, based on the constructs of the Grammar

    3.4k Dec 30, 2022
    Visualise top-rated GitHub repositories in a barchart by keyword

    This python script was written for simple purpose -- to visualise top-rated GitHub repositories in a barchart by keyword. Script generates html-page with barchart and information about repository own

    Cur1iosity 2 Feb 07, 2022
    A Python Library for Self Organizing Map (SOM)

    SOMPY A Python Library for Self Organizing Map (SOM) As much as possible, the structure of SOM is similar to somtoolbox in Matlab. It has the followin

    Vahid Moosavi 497 Dec 29, 2022
    Apache Superset is a Data Visualization and Data Exploration Platform

    Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Rele

    The Apache Software Foundation 50k Jan 06, 2023
    Interactive Dashboard for Visualizing OSM Data Change

    Dashboard and intuitive data downloader for more interactive experience with interpreting osm change data.

    1 Feb 20, 2022
    ScisorWiz: Differential Isoform Visualizer for Long-Read RNA Sequencing Data

    ScisorWiz: Vizualizer for Differential Isoform Expression README ScisorWiz is a linux-based R-package for visualizing differential isoform expression

    Alexander Stein 6 Oct 04, 2022
    OpenStats is a library built on top of streamlit that extracts data from the Github API and shows the main KPIs

    Open Stats Discover and share the KPIs of your OpenSource project. OpenStats is a library built on top of streamlit that extracts data from the Github

    Pere Miquel Brull 4 Apr 03, 2022
    Lumen provides a framework for visual analytics, which allows users to build data-driven dashboards from a simple yaml specification

    Lumen project provides a framework for visual analytics, which allows users to build data-driven dashboards from a simple yaml specification

    HoloViz 120 Jan 04, 2023
    With Holoviews, your data visualizes itself.

    HoloViews Stop plotting your data - annotate your data and let it visualize itself. HoloViews is an open-source Python library designed to make data a

    HoloViz 2.3k Jan 04, 2023
    a robust room presence solution for home automation with nearly no false negatives

    Argos Room Presence This project builds a room presence solution on top of Argos. Using just a cheap raspberry pi zero w (plus an attached pi camera,

    Angad Singh 46 Sep 18, 2022