Python & Julia port of codes in excellent R books

Overview

X4DS

This repo is a collection of

Python & Julia port of codes in the following excellent R books:

Python Stack Julia Stack
Language
Version
v3.9 v1.7
Data
Processing
  • Pandas
  • DataFrames
  • Visualization
  • Matplotlib
  • Seaborn
  • MakiE
  • AlgebraOfGraphics
  • Machine
    Learning
  • Scikit-Learn
  • MLJ
  • Probablistic
    Programming
  • PyMC
  • Turing
  • Code Styles

    2.1. Basics

    • prefer enumerate() over range(len())
    xs = range(3)
    
    # good
    for ind, x in enumerate(xs):
      print(f'{ind}: {x}')
    
    # bad
    for i in range(len(xs)):
      print(f'{i}: {xs[i]}')

    2.2. Matplotlib

    including seaborn

    • prefer Axes object over Figure object
    • use constrained_layout=True when draw subplots
    # good
    _, axes = plt.subplots(1, 2, constrained_layout=True)
    axes[0].plot(x1, y1)
    axes[1].hist(x2, y2)
    
    # bad
    plt.subplot(121)
    plt.plot(x1, y1)
    plt.subplot(122)
    plt.hist(x2, y2)
    • prefer axes.flatten() over plt.subplot() in cases where subplots' data is iterable
    • prefer zip() or enumerate() over range() for iterable objects
    # good
    _, ax = plt.subplots(2, 2, figsize=[12,8],constrained_layout=True)
    
    for ax, x, y in zip(axes.flatten(), xs, ys):
      ax.plot(x, y)
    
    # bad
    for i in range(4):
      ax = plt.subplot(2, 2, i+1)
      ax.plot(x[i], y[i])
    • prefer set() method over set_*() method
    # good
    ax.set(xlabel='x', ylabel='y')
    
    # bad
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    • Prefer despine() over ax.spines[*].set_visible()
    # good
    sns.despine()
    
    # bad
    ax.spines["top"].set_visible(False)
    ax.spines["bottom"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.spines["left"].set_visible(False)

    2.3. Pandas

    • prefer df['col'] over df.col
    # good
    movies['duration']
    
    # bad
    movies.duration
    • prefer df.query over df[] or df.loc[] in simple-selection
    # good
    movies.query('duration >= 200')
    
    # bad
    movies[movies['duration'] >= 200]
    movies.loc[movies['duration'] >= 200, :]
    • prefer df.loc and df.iloc over df[] in multiple-selection
    # good
    movies.loc[movies['duration'] >= 200, 'genre']
    movies.iloc[0:2, :]
    
    # bad
    movies[movies['duration'] >= 200].genre
    movies[0:2]

    LaTeX Styles

    Multiple lines

    Reduce the use of begin{array}...end{array}

    • equations: begin{aligned}...end{aligned}
    $$
    \begin{aligned}
    y_1 = x^2 + 2*x \\
    y_2 = x^3 + x
    \end{aligned}
    $$
    • equations with conditions: begin{cases}...end{cases}
    $$
    \begin{cases}
    y = x^2 + 2*x & x > 0 \\
    y = x^3 + x & x ≤ 0
    \end{cases}
    $$
    • matrix: begin{matrix}...end{matrix}
    $$
    \begin{vmatrix}
      a + a^′ & b + b^′ \\ c & d
      \end{vmatrix}= \begin{vmatrix}
      a & b \\ c & d
      \end{vmatrix} + \begin{vmatrix}
      a^′ & b^′ \\ c & d
    \end{vmatrix}
    $$

    Brackets

    • prefer \Bigg...\Bigg over \left...\right
    $$
    A\Bigg[v_1\ v_2\ \ v_r\Bigg]
    $$
    • prefer \underset{}{} over \underset{}
    $$
    \underset{θ}{\mathrm{argmax}}\ p(x_i|θ)
    $$

    Expressions

    • prefer ^{\top} over ^T for transpose

    $$ 𝐀^⊤ $$

    $$
    𝐀^{\top}
    $$
    • prefer \to over \rightarrow for limit

    $$ \lim_{n → ∞} $$

    $$
    \lim_{n\to \infty}
    $$
    • prefer underset{}{} over \limits_

    $$ \underset{w}{\rm argmin}\ (wx +b) $$

    $$
    \underset{w}{\rm argmin}\ (wx +b)
    $$

    Fonts

    • prefer \mathrm over \mathop or \operatorname
    $$
    θ_{\mathrm{MLE}}=\underset{θ}{\mathrm{argmax}}\ ∑_{i = 1}^{N}\log p(x_i|θ)
    $$

    ISLR

    References

    style <style> table { border-collapse: collapse; text-align: center; } </style>
    Owner
    Gitony
    Gitony
    Data Visualizations for the #30DayChartChallenge

    The #30DayChartChallenge This repository contains all the charts made for the #30DayChartChallenge during the month of April. This project aims to exp

    Isaac Arroyo 7 Sep 20, 2022
    Statistical data visualization using matplotlib

    seaborn: statistical data visualization Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing

    Michael Waskom 10.2k Dec 30, 2022
    Tweets your monthly GitHub Contributions as Wordle grid

    Tweets your monthly GitHub Contributions as Wordle grid

    Venu Vardhan Reddy Tekula 5 Feb 16, 2022
    Tidy data structures, summaries, and visualisations for missing data

    naniar naniar provides principled, tidy ways to summarise, visualise, and manipulate missing data with minimal deviations from the workflows in ggplot

    Nicholas Tierney 611 Dec 22, 2022
    Lightweight, extensible data validation library for Python

    Cerberus Cerberus is a lightweight and extensible data validation library for Python. v = Validator({'name': {'type': 'string'}}) v.validate({

    eve 2.9k Dec 27, 2022
    DataVisualization - The evolution of my arduino and python journey. New level of competence achieved

    DataVisualization - The evolution of my arduino and python journey. New level of competence achieved

    1 Jan 03, 2022
    By default, networkx has problems with drawing self-loops in graphs.

    By default, networkx has problems with drawing self-loops in graphs. It makes it hard to draw a graph with self-loops or to make a nicely looking chord diagram. This repository provides some code to

    Vladimir Shitov 5 Jan 06, 2022
    A toolkit to generate MR sequence diagrams

    mrsd: a toolkit to generate MR sequence diagrams mrsd is a Python toolkit to generate MR sequence diagrams, as shown below for the basic FLASH sequenc

    Julien Lamy 3 Dec 25, 2021
    Small U-Net for vehicle detection

    Small U-Net for vehicle detection Vivek Yadav, PhD Overview In this repository , we will go over using U-net for detecting vehicles in a video stream

    Vivek Yadav 91 Nov 03, 2022
    OpenStats is a library built on top of streamlit that extracts data from the Github API and shows the main KPIs

    Open Stats Discover and share the KPIs of your OpenSource project. OpenStats is a library built on top of streamlit that extracts data from the Github

    Pere Miquel Brull 4 Apr 03, 2022
    Simple, realtime visualization of neural network training performance.

    pastalog Simple, realtime visualization server for training neural networks. Use with Lasagne, Keras, Tensorflow, Torch, Theano, and basically everyth

    Rewon Child 416 Dec 29, 2022
    2021 grafana arbitrary file read

    2021_grafana_arbitrary_file_read base on pocsuite3 try 40 default plugins of grafana alertlist annolist barchart cloudwatch dashlist elasticsearch gra

    ATpiu 5 Nov 09, 2022
    Visualization Website by using Dash and Heroku

    Visualization Website by using Dash and Heroku You can visit the website https://payroll-expense-analysis.herokuapp.com/ In this project, I am interes

    YF Liu 1 Jan 14, 2022
    Backend app for visualizing CANedge log files in Grafana (directly from local disk or S3)

    CANedge Grafana Backend - Visualize CAN/LIN Data in Dashboards This project enables easy dashboard visualization of log files from the CANedge CAN/LIN

    13 Dec 15, 2022
    An animation engine for explanatory math videos

    Powered By: An animation engine for explanatory math videos Hi there, I'm Zheer 👋 I'm a Software Engineer and student!! 🌱 I’m currently learning eve

    Zaheer ud Din Faiz 2 Nov 04, 2021
    In-memory Graph Database and Knowledge Graph with Natural Language Interface, compatible with Pandas

    CogniPy for Pandas - In-memory Graph Database and Knowledge Graph with Natural Language Interface Whats in the box Reasoning, exploration of RDF/OWL,

    Cognitum Octopus 34 Dec 13, 2022
    Curvipy - The Python package for visualizing curves and linear transformations in a super simple way

    Curvipy - The Python package for visualizing curves and linear transformations in a super simple way

    Dylan Tintenfich 55 Dec 28, 2022
    Personal IMDB Graphs with Bokeh

    Personal IMDB Graphs with Bokeh Do you like watching movies and also rate all of them in IMDB? Would you like to look at your IMDB stats based on your

    2 Dec 15, 2021
    University of Missouri - Kansas City: CS451R: Capstone

    CS451RC University of Missouri - Kansas City: CS451R: Capstone Installation cd git clone https://github.com/ala2q6/CS451RC.git cd CS451RC pip3 instal

    Alex Arbuckle 1 Nov 17, 2021
    Create charts with Python in a very similar way to creating charts using Chart.js

    Create charts with Python in a very similar way to creating charts using Chart.js. The charts created are fully configurable, interactive and modular and are displayed directly in the output of the t

    Nicolas H 68 Dec 08, 2022