Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...

Overview

MLOps Platforms

MLOps is an especially confusing landscape with hundreds of tools available. This project helps to navigate the space of MLOps platforms.

Understanding MLOps platforms is complex. Platforms have their own specializations and there is no clear line between a tool (with a narrow focus) and a platform (which supports many ML lifecycle activities). The below (from the Thoughtworks Guide to MLOps Platforms) illustrates how some of the platforms specialize in particular areas (bottom) and others aim to cover the whole lifecycle with equal focus (top):

MLOps Landscape Diagram

Even platforms that have a similar scope have different concepts and strategies, making them hard to compare directly. This repository provides resources for evaluating MLOps platforms.

If you're wondering what process to use to evaluate MLOps platforms, see the Thoughtworks Guide. If you know how to evaluate MLOps platforms and want materials, read on.

Comparison Matrix Format

The matrix an open format of categories with links to vendor documentation within cells to highlight features. This lets vendors do things their own ways and helps readers find the detail they need.

Comparison Matrix

We suggest to click through to the master spreadsheet in google sheets:

matrix

If you can't access or don't like google sheets then there is a translation of the matrix into Github markdown

Platform Profiles

These profiles are concise marketing-free introductions to key concepts of MLOps platforms. This provides just enough context to make sense of the features in the matrix.

Contributions

Everyone is welcome to contribute, including vendors. Language should be neutral - marketing language will not be accepted.

Changes are welcome by PR or issues - please create a copy of the spreadsheet, link to or upload your copy and explain which parts are changed. Please follow the existing format or raise an issue in advance to suggest changes to the format. On approval a maintainer will then update the master spreadsheet used to generate the markdown.

Disclaimer

We do our best to keep this information accurate and up-to-date but cannot provide guarantees. References to documentation are provided throughout so readers can check for themselves. If you spot anything inaccurate then please raise an issue or pull request (see Contributing section).

Owner
Thoughtworks
Thoughtworks
Automated machine learning: Review of the state-of-the-art and opportunities for healthcare

Automated machine learning: Review of the state-of-the-art and opportunities for healthcare

42 Dec 23, 2022
🌊 River is a Python library for online machine learning.

River is a Python library for online machine learning. It is the result of a merger between creme and scikit-multiflow. River's ambition is to be the go-to library for doing machine learning on strea

OnlineML 4k Jan 03, 2023
GroundSeg Clustering Optimized Kdtree

ground seg and clustering based on kitti velodyne data, and a additional optimized kdtree for knn and radius nn search

2 Dec 02, 2021
Predico Disease Prediction system based on symptoms provided by patient- using Python-Django & Machine Learning

Predico Disease Prediction system based on symptoms provided by patient- using Python-Django & Machine Learning

Felix Daudi 1 Jan 06, 2022
Greykite: A flexible, intuitive and fast forecasting library

The Greykite library provides flexible, intuitive and fast forecasts through its flagship algorithm, Silverkite.

LinkedIn 1.7k Jan 04, 2023
Sleep stages are classified with the help of ML. We have used 4 different ML algorithms (SVM, KNN, RF, NN) to demonstrate them

Sleep stages are classified with the help of ML. We have used 4 different ML algorithms (SVM, KNN, RF, NN) to demonstrate them.

Anirudh Edpuganti 3 Apr 03, 2022
Combines MLflow with a database (PostgreSQL) and a reverse proxy (NGINX) into a multi-container Docker application

Combines MLflow with a database (PostgreSQL) and a reverse proxy (NGINX) into a multi-container Docker application (with docker-compose).

Philip May 2 Dec 03, 2021
This machine learning model was developed for House Prices

This machine learning model was developed for House Prices - Advanced Regression Techniques competition in Kaggle by using several machine learning models such as Random Forest, XGBoost and LightGBM.

serhat_derya 1 Mar 02, 2022
Polyglot Machine Learning example for scraping similar news articles.

Polyglot Machine Learning example for scraping similar news articles In this example, we will see how we can work with Machine Learning applications w

MetaCall 15 Mar 28, 2022
Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.

sklearn-evaluation Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking, and Jupyter notebook analysis. Suppo

Eduardo Blancas 354 Dec 31, 2022
TIANCHI Purchase Redemption Forecast Challenge

TIANCHI Purchase Redemption Forecast Challenge

Haorui HE 4 Aug 26, 2022
A quick reference guide to the most commonly used patterns and functions in PySpark SQL

Using PySpark we can process data from Hadoop HDFS, AWS S3, and many file systems. PySpark also is used to process real-time data using Streaming and

Sundar Ramamurthy 53 Dec 21, 2022
Kalman filter library

The kalman filter framework described here is an incredibly powerful tool for any optimization problem, but particularly for visual odometry, sensor fusion localization or SLAM.

comma.ai 276 Jan 01, 2023
Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible

IMBENS: Class-imbalanced Ensemble Learning in Python Language: English | Chinese/中文 Links: Documentation | Gallery | PyPI | Changelog | Source | Downl

Zhining Liu 176 Jan 04, 2023
Summer: compartmental disease modelling in Python

Summer: compartmental disease modelling in Python Summer is a Python-based framework for the creation and execution of compartmental (or "state-based"

6 May 13, 2022
#30DaysOfStreamlit is a 30-day social challenge for you to build and deploy Streamlit apps.

30 Days Of Streamlit 🎈 This is the official repo of #30DaysOfStreamlit — a 30-day social challenge for you to learn, build and deploy Streamlit apps.

Streamlit 53 Jan 02, 2023
MLR - Machine Learning Research

Machine Learning Research 1. Project Topic 1.1. Exsiting research Benmark: https://paperswithcode.com/sota ACL anthology for NLP papers: http://www.ac

Charles 69 Oct 20, 2022
Pandas DataFrames and Series as Interactive Tables in Jupyter

Pandas DataFrames and Series as Interactive Tables in Jupyter Star Turn pandas DataFrames and Series into interactive datatables in both your notebook

Marc Wouts 364 Jan 04, 2023
Pydantic based mock data generation

This library offers powerful mock data generation capabilities for pydantic based models. It can also be used with other libraries that use pydantic as a foundation, for example SQLModel, Beanie and

Na'aman Hirschfeld 396 Dec 28, 2022
In this Repo a simple Sklearn Model will be trained and pushed to MLFlow

SKlearn_to_MLFLow In this Repo a simple Sklearn Model will be trained and pushed to MLFlow Install This Repo is based on poetry python3 -m venv .venv

1 Dec 13, 2021