Efficient Python Tricks and Tools for Data Scientists

Overview

Why efficient Python? Because using Python more efficiently makes your code more readable and faster to run.

Why for data scientists? Because Python is used across many fields, and the tools that matter in data science are not necessarily useful in other fields such as web development.

The goal of this book is to spread awareness of efficient ways to use Python. These include:

  • efficient built-in methods and libraries to work with iterators, dictionaries, functions, and classes
  • efficient methods to work with popular data science libraries such as pandas and NumPy
  • efficient tools to incorporate in a data science project
  • efficient tools to incorporate in any project
  • efficient tools to work with Jupyter Notebook.
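
As a taste of the first category, here is a minimal sketch using only the standard library. It is not taken from the book; the sample data and variable names are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import islice

# Hypothetical sample data, invented for this illustration.
labels = ["cat", "dog", "cat", "bird", "dog", "cat"]

# Counter tallies hashable items without a manual loop.
label_counts = Counter(labels)
top_two = label_counts.most_common(2)

# defaultdict groups values without first checking whether a key exists.
positions = defaultdict(list)
for index, label in enumerate(labels):
    positions[label].append(index)

# islice lazily takes the first few items of any iterator,
# so the million-element generator is never fully evaluated.
first_three_squares = list(islice((n * n for n in range(1_000_000)), 3))

print(top_two)
print(dict(positions))
print(first_three_squares)
```

Each of these replaces several lines of hand-written loop and bookkeeping code with a single call, which is the kind of readability gain the list above refers to.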

What Should You Expect From This Book?

This book expects you to have some basic knowledge of Python and data science.

You should also expect bite-size code snippets in each section, so you can pick up several ideas in under a minute. I have included a link to the resources for every tool introduced in case you want to explore it further.

About This Book

This book includes more than 300 tips and tools I have shared daily on my website, Data Science Simplified. If you want to receive new tips in your mailbox, you can subscribe to my website.

About The Author

Khuyen Tran is a data science writer at NVIDIA and a data science intern at Ocelot Consulting. She has written over 150 data science articles on Towards Data Science, with 100k+ views per month, and 300+ daily data science tips at Data Science Simplified. Her current mission is to make open source more accessible to the data science community.

Supporters

Special thanks to these supporters for supporting this project!

Owner
Khuyen Tran
Data Scientist | Data Science Writer at NVIDIA & Towards Data Science
SCICO is a Python package for solving the inverse problems that arise in scientific imaging applications.

Scientific Computational Imaging COde (SCICO) SCICO is a Python package for solving the inverse problems that arise in scientific imaging applications

Los Alamos National Laboratory 37 Dec 21, 2022
CS 506 - Computational Tools for Data Science

CS 506 - Computational Tools for Data Science Code, slides, and notes for Boston University CS506 Fall 2021 The Final Project Repository can be found

Lance Galletti 14 Mar 23, 2022
collection of interesting Computer Science resources

collection of interesting Computer Science resources

Kirill Bobyrev 137 Dec 22, 2022
A modular single-molecule analysis interface

MOSAIC: A modular single-molecule analysis interface MOSAIC is a single molecule analysis toolbox that automatically decodes multi-state nanopore data

National Institute of Standards and Technology 35 Dec 13, 2022
Efficient Python Tricks and Tools for Data Scientists

Why efficient Python? Because using Python more efficiently will make your code more readable and run more efficiently.

Khuyen Tran 944 Dec 28, 2022
CoCalc: Collaborative Calculation in the Cloud

CoCalc Collaborative Calculation and Data Science CoCalc is a virtual online workspace for calculations, research, collaboration and authoring do

SageMath, Inc. 1k Dec 29, 2022
A computer algebra system written in pure Python

SymPy See the AUTHORS file for the list of authors. And many more people helped on the SymPy mailing list, reported bugs, helped organize SymPy's part

SymPy 9.9k Jan 08, 2023
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

Cookiecutter Data Science A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. Project homepage

6.4k Jan 02, 2023
3D medical imaging reconstruction software

InVesalius InVesalius generates 3D medical imaging reconstructions based on a sequence of 2D DICOM files acquired with CT or MRI equipment. InVesaliu

443 Jan 01, 2023
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis. You write a high level configuration file specifying your in

Blue Collar Bioinformatics 915 Dec 29, 2022
ckan 3.6k Dec 27, 2022
Data intensive science for everyone.

Galaxy Project 1k Jan 08, 2023
A simple computer program made with Python on the brachistochrone curve.

Brachistochrone-curve This is a simple computer program made with Python on the brachistochrone curve. I decided to write it after a physics lesson on

Diego Romeo 1 Dec 16, 2021
Book on Julia for Data Science

Book on Julia for Data Science

Julia Data Science 349 Dec 25, 2022
Animation engine for explanatory math videos

Manim is an engine for precise programmatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This repo

Grant Sanderson 48.9k Jan 03, 2023
AnuGA for the simulation of the shallow water equation

ANUGA Contents ANUGA What is ANUGA? Installation Documentation and Help Mailing Lists Web sites Latest source code Bug reports Developer information L

Geoscience Australia 147 Dec 14, 2022
🍊 📊 💡 Orange: Interactive data analysis

Orange Data Mining Orange is a data mining and visualization toolbox for novice and expert alike. To explore data with Orange, one requires no program

Bioinformatics Laboratory 3.9k Jan 05, 2023
Doing bayesian data analysis - Python/PyMC3 versions of the programs described in Doing bayesian data analysis by John K. Kruschke

Doing_bayesian_data_analysis This repository contains the Python version of the R programs described in the great book Doing bayesian data analysis (f

Osvaldo Martin 851 Dec 27, 2022
PennyLane is a cross-platform Python library for differentiable programming of quantum computers.

PennyLane is a cross-platform Python library for differentiable programming of quantum computers. Train a quantum computer the same way as a neural network.

PennyLaneAI 1.6k Jan 04, 2023
A framework for feature exploration in Data Science

Beehive A framework for feature exploration in Data Science Background What do we do when we finish one episode of feature exploration in a jupyter no

Steven IJ 1 Jan 03, 2022