Pandas-method-chaining is a plugin for flake8 that provides method chaining linting for pandas code

Overview

pandas-method-chaining

pandas-method-chaining is a plugin for flake8 that provides method chaining linting for pandas code.

It is a fork from pandas-vet. The global framework of pandas-vet has been reused. All rules have been fully rewritten and adapted to pandas method chaining, except the one dealing with the use of inplace=True.

Motivation

The source of motivation is to help pandas users to write method chaining code style.

Why a fork? The original pandas-vet includes rules which don't deal with method chaining, and some of them are not compatible with this style (e.g. PD005 and PD006 using operators instead of methods).

A source of inspiration was Matt Harrisson's book Effective Pandas.

Limits

  • False positives may occur: e.g., either non pandas statements matching the rules, or intentional style of the programmer.
  • Output messages could be improved: e.g., either too general, or not adapted to specific cases.

Installation

pandas-method-chaining is a plugin for flake8. If you don't have flake8 already, it will install automatically when you install pandas-method-chaining.

For the moment, the plugin is on github only and can be installed, in a dedicated environment, after cloning the repo by:

$ pip install -e .

When this plugin meets its users, it will be added to PyPI to ease the installation.

Usage

Once installed successfully in an environment that also has flake8 installed, pandas-method-chaining should run using:

$ flake8 python_script.py --select=PMC

Contributors

Contributors from pandas-vet

Other contributor

  • fran6w

List of warnings

Except PMC001 which uses a should, other warnings use a could.

PMC001 usage of inplace=True should be avoided

PMC002 reassignment using call could be replaced by method chaining

PMC003 reassignment using subscript could be replaced by method chaining

PMC004 assignment using subscript could be replaced by assign()

PMC005 assignment using attribute could be replaced by assign()

PMC006 assignment of index or columns could be replaced by rename()

PMC007 selection reusing a variable could be performed with a lambda

Owner
Francis
Computer & Data Scientist - Data & AI Consultant - Python and Data Science Trainer & Teacher
Francis
Machine Learning Algorithms

Machine-Learning-Algorithms In this project, the dataset was created through a survey opened on Google forms. The purpose of the form is to find the p

Göktuğ Ayar 3 Aug 10, 2022
Python Machine Learning Jupyter Notebooks (ML website)

Python Machine Learning Jupyter Notebooks (ML website) Dr. Tirthajyoti Sarkar, Fremont, California (Please feel free to connect on LinkedIn here) Also

Tirthajyoti Sarkar 2.6k Jan 03, 2023
neurodsp is a collection of approaches for applying digital signal processing to neural time series

neurodsp is a collection of approaches for applying digital signal processing to neural time series, including algorithms that have been proposed for the analysis of neural time series. It also inclu

NeuroDSP 224 Dec 02, 2022
Greykite: A flexible, intuitive and fast forecasting library

The Greykite library provides flexible, intuitive and fast forecasts through its flagship algorithm, Silverkite.

LinkedIn 1.4k Jan 15, 2022
TorchDrug is a PyTorch-based machine learning toolbox designed for drug discovery

A powerful and flexible machine learning platform for drug discovery

MilaGraph 1.1k Jan 08, 2023
XAI - An eXplainability toolbox for machine learning

XAI - An eXplainability toolbox for machine learning XAI is a Machine Learning library that is designed with AI explainability in its core. XAI contai

The Institute for Ethical Machine Learning 875 Dec 27, 2022
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. 10x Larger Models 10x Faster Trainin

Microsoft 8.4k Dec 30, 2022
A Multipurpose Library for Synthetic Time Series Generation in Python

TimeSynth Multipurpose Library for Synthetic Time Series Please cite as: J. R. Maat, A. Malali, and P. Protopapas, “TimeSynth: A Multipurpose Library

278 Dec 26, 2022
Distributed Evolutionary Algorithms in Python

DEAP DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data stru

Distributed Evolutionary Algorithms in Python 4.9k Jan 05, 2023
Tutorial for Decision Threshold In Machine Learning.

Decision-Threshold-ML Tutorial for improve skills: 'Decision Threshold In Machine Learning' (from GeeksforGeeks) by Marcus Mariano For more informatio

0 Jan 20, 2022
This is a curated list of medical data for machine learning

Medical Data for Machine Learning This is a curated list of medical data for machine learning. This list is provided for informational purposes only,

Andrew L. Beam 5.4k Dec 26, 2022
XGBoost-Ray is a distributed backend for XGBoost, built on top of distributed computing framework Ray.

XGBoost-Ray is a distributed backend for XGBoost, built on top of distributed computing framework Ray.

92 Dec 14, 2022
Extreme Learning Machine implementation in Python

Python-ELM v0.3 --- ARCHIVED March 2021 --- This is an implementation of the Extreme Learning Machine [1][2] in Python, based on scikit-learn. From

David C. Lambert 511 Dec 20, 2022
MIT-Machine Learning with Python–From Linear Models to Deep Learning

MIT-Machine Learning with Python–From Linear Models to Deep Learning | One of the 5 courses in MIT MicroMasters in Statistics & Data Science Welcome t

2 Aug 23, 2022
Simple linear model implementations from scratch.

Hand Crafted Models Simple linear model implementations from scratch. Table of contents Overview Project Structure Getting started Citing this project

Jonathan Sadighian 2 Sep 13, 2021
Toolkit for building machine learning models that generalize to unseen domains and are robust to privacy and other attacks.

Toolkit for Building Robust ML models that generalize to unseen domains (RobustDG) Divyat Mahajan, Shruti Tople, Amit Sharma Privacy & Causal Learning

Microsoft 149 Jan 06, 2023
LinearRegression2 Tvads and CarSales

LinearRegression2_Tvads_and_CarSales This project infers the insight that how the TV ads for cars and car Sales are being linked with each other. It i

Ashish Kumar Yadav 1 Dec 29, 2021
flexible time-series processing & feature extraction

A corona statistics and information telegram bot.

PreDiCT.IDLab 206 Dec 28, 2022
Machine Learning e Data Science com Python

Machine Learning e Data Science com Python Arquivos do curso de Data Science e Machine Learning com Python na Udemy, cliqe aqui para acessá-lo. O prin

Renan Barbosa 1 Jan 27, 2022
Python library for multilinear algebra and tensor factorizations

scikit-tensor is a Python module for multilinear algebra and tensor factorizations

Maximilian Nickel 394 Dec 09, 2022