A simple guide to MLOps through ZenML and its various integrations.

Last update: Dec 27, 2022

Overview

ZenBytes

Join our Slack Community and become part of the ZenML family.

Give the main ZenML repo a GitHub star to show your love.

ZenBytes is a series of practical lessons about MLOps through ZenML and its various integrations. It is intended for people looking to learn about MLOps in general, and for practitioners specifically looking to learn more about ZenML.

🙏 About ZenML

ZenML is an extensible, open-source MLOps framework for creating production-ready machine learning pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and has interfaces and abstractions tailored to ML workflows. The ZenML repository and docs have more details.

ZenML is a good tool for learning MLOps for two reasons:

🔹 ZenML focuses on being un-opinionated about underlying tooling and infrastructure across the MLOps stack.
🔹 ZenML presents itself as a pipeline tool, making all development in ZenML data-centric rather than model-centric.

🧱 Structure of Lessons

The lessons are structured in Chapters. Each chapter is a notebook that walks through and explains various concepts:

Chapter 0: Basics
Chapter 1: Building a ML(Ops) pipeline
Chapter 2: Transitioning across stacks
Coming soon: More chapters

💻 System Requirements

Currently, this will only run on UNIX systems.

To run these lessons, you need some packages installed on your machine. Only some parts of the codebase need them, and you might get away with just Python and pip install -r requirements.txt for the rest, but we recommend installing all of them:

package	MacOS installation	Linux installation
docker	Docker Desktop for Mac	Docker Engine for Linux
kubectl	kubectl for mac	kubectl for linux
k3d	Brew Installation of k3d	k3d installation linux

You might also need to install Anaconda to get the MLflow deployment to work.
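Before starting the lessons, it can help to confirm that the tools in the table above are actually on your PATH. The following is a minimal sketch, assuming the executables are named docker, kubectl, and k3d; the helper name missing_tools is hypothetical and not part of ZenML or ZenBytes.

```python
import shutil

# Executable names assumed from the system requirements table.
REQUIRED_TOOLS = ("docker", "kubectl", "k3d")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

This only checks that each executable can be located; it does not verify versions or that the Docker daemon is running.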

🐍 Python Requirements

Once you've got the system requirements figured out, let's jump into the Python packages you need. Within the Python environment of your choice, run:

git clone https://github.com/zenml-io/zenbytes
cd zenbytes
pip install -r requirements.txt

If you are running the run.py script, you will also need to install some integrations using zenml:

zenml integration install sklearn -f
zenml integration install dash -f
zenml integration install evidently -f
zenml integration install mlflow -f
zenml integration install kubeflow -f
zenml integration install seldon -f

📓 Diving into the code

We're ready to go now. Start Jupyter and work through the notebooks step by step:

jupyter notebook

🏁 Cleaning up when you're done

Once you are done running all notebooks, you might want to stop all running processes. To do so, run the following commands. (This will tear down your k3d cluster and the local Docker registry.)

zenml stack set aws_kubeflow_stack
zenml stack down -f
zenml stack set local_kubeflow_stack
zenml stack down -f

❓ FAQ

macOS: When starting the container registry for Kubeflow, I get an error about port 5000 not being available: OSError: [Errno 48] Address already in use

Solution: For Kubeflow to run, the Docker container registry currently needs to be on port 5000. macOS, however, uses port 5000 for the AirPlay Receiver. Here is a guide on how to fix this: Freeing up port 5000.
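To check whether port 5000 is already taken before starting the registry, you can try connecting to it. This is a rough sketch using only the Python standard library; the function name port_in_use is hypothetical, and it assumes the conflicting service accepts connections on 127.0.0.1.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds,
        # which means a listener already occupies the port.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(5000):
        print("Port 5000 is taken; free it before starting the registry.")
    else:
        print("Port 5000 is free.")
```

If the check reports that the port is taken on macOS, the AirPlay Receiver described above is a likely cause, though any other local service bound to port 5000 would produce the same result.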

Owner

ZenML
