CrayLabs and user contributed examples of using SmartSim for various simulation and machine learning applications.

Overview

SmartSim Example Zoo

This repository contains CrayLabs and user contributed examples of using SmartSim for various simulation and machine learning applications.

The CrayLabs team will attempt to keep examples updated with current releases, but all user contributed examples should specify the release they were created with.

Contributing Examples

We welcome any and all contributions to this repository. The CrayLabs team will do their best to review them in a timely manner. If you contribute an example, please include a description and references to the code and to any relevant previous implementations or open source code that the work is based on, for the benefit of anyone who would like to try out your example.

Examples by Paper

The following examples are based on existing research papers. Each example lists the paper, previous works, and links to the implementation (stored either within this repository or in a separate repository).

1. DeepDriveMD

  • Contributing User: CrayLabs
  • Tags: OpenMM, CVAE, online inference, unsupervised online learning, PyTorch, ensemble

This use case highlights many features of SmartSim and SmartRedis and shows how, together, they can orchestrate complex workflows with coupled applications without using the filesystem to exchange information.

More specifically, this use case is based on the original DeepDriveMD work, which was later extended with an asynchronous streaming version. SmartSim extends the streaming implementation through the use of the SmartSim architecture. The main difference between the SmartSim implementation and the previous implementations is that neither ML models nor Molecular Dynamics (MD) intermediate results are stored on the file system. Additionally, the inference portion of the workflow takes place inside the database instead of in a separate task launched on the system.

2. TensorFlowFoam

  • Contributing User: CrayLabs
  • Tags: Online Inference, TensorFlow, OpenFOAM, supervised learning

This example shows how to use TensorFlow inside of OpenFOAM simulations using SmartSim.

More specifically, this SmartSim use case adapts the TensorFlowFoam work which utilized a deep neural network to predict steady-state turbulent viscosities of the Spalart-Allmaras (SA) model. This use case highlights that a machine learning model can be evaluated using SmartSim from within a simulation with minimal external library code. For the OpenFOAM use case herein, only four SmartRedis client API calls are needed to initialize a client connection, send tensor data for evaluation, execute the TensorFlow model, and retrieve the model inference result.
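The four-call pattern described above can be sketched in plain Python. This is a minimal illustration, not the code from the example: `InMemoryClient` is a hypothetical stand-in that keeps everything in a dictionary so the sketch runs without a database, `toy_viscosity_model` is a placeholder for the trained network, and the key names are invented. The method names loosely mirror the SmartRedis Python client (`put_tensor`, `run_model`, `get_tensor`); consult the SmartRedis documentation for the real signatures.

```python
from typing import Callable, Dict, List

Tensor = List[float]


class InMemoryClient:
    """Hypothetical stand-in for a SmartRedis client (assumption, not the real API)."""

    def __init__(self) -> None:
        self._tensors: Dict[str, Tensor] = {}
        self._models: Dict[str, Callable[[Tensor], Tensor]] = {}

    def set_model(self, name: str, model: Callable[[Tensor], Tensor]) -> None:
        # In a real workflow the driver script would upload the model once.
        self._models[name] = model

    def put_tensor(self, name: str, data: Tensor) -> None:
        self._tensors[name] = list(data)

    def run_model(self, name: str, inputs: List[str], outputs: List[str]) -> None:
        # The model is evaluated where the data lives, not in the caller.
        self._tensors[outputs[0]] = self._models[name](self._tensors[inputs[0]])

    def get_tensor(self, name: str) -> Tensor:
        return list(self._tensors[name])


def toy_viscosity_model(features: Tensor) -> Tensor:
    # Placeholder arithmetic; NOT the TensorFlowFoam network.
    return [0.5 * x for x in features]


# Call 1: initialize the client connection.
client = InMemoryClient()
client.set_model("toy_model", toy_viscosity_model)  # setup step, done once

# Call 2: send tensor data for evaluation.
client.put_tensor("features", [1.0, 2.0, 4.0])
# Call 3: execute the model on the stored tensor.
client.run_model("toy_model", inputs=["features"], outputs=["nu_t"])
# Call 4: retrieve the inference result.
prediction = client.get_tensor("nu_t")
print(prediction)
```

The simulation side only ever touches named tensors, which is why so little external library code is needed inside the solver.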

In general, this example provides a useful driver script for those looking to run OpenFOAM with SmartSim.

3. ML-EKE

  • Contributing User: CrayLabs
  • Tags: Online inference, MOM6, climate modeling, ensemble, parameterization replacement

This example was a collaboration between CrayLabs (HPE), NCAR, and the University of Victoria. Using SmartSim, this example shows how to run an ensemble of simulations in which every member uses the SmartSim architecture to replace a parameterization (MEKE) within its global ocean simulation (MOM6).

Paper Abstract:

We demonstrate the first climate-scale, numerical ocean simulations improved through distributed, online inference of Deep Neural Networks (DNN) using SmartSim. SmartSim is a library dedicated to enabling online analysis and Machine Learning (ML) for traditional HPC simulations. In this paper, we detail the SmartSim architecture and provide benchmarks including online inference with a shared ML model on heterogeneous HPC systems. We demonstrate the capability of SmartSim by using it to run a 12-member ensemble of global-scale, high-resolution ocean simulations, each spanning 19 compute nodes, all communicating with the same ML architecture at each simulation timestep. In total, 970 billion inferences are collectively served by running the ensemble for a total of 120 simulated years. Finally, we show our solution is stable over the full duration of the model integrations, and that the inclusion of machine learning has minimal impact on the simulation runtimes.

Since this is original research done by CrayLabs, there is no previous implementation.
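The idea of many ensemble members sharing one model can be sketched as follows. This is a toy illustration under stated assumptions: `SharedStore` is a hypothetical in-memory stand-in for the shared database, `toy_eke_model` is placeholder arithmetic rather than the trained network, the per-member key prefix is an invented naming scheme, and members run one after another here although a real ensemble runs them concurrently. Only the ensemble size of 12 comes from the abstract.

```python
from typing import Callable, Dict, List

Tensor = List[float]


class SharedStore:
    """Hypothetical stand-in for the single database shared by all members."""

    def __init__(self, model: Callable[[Tensor], Tensor]) -> None:
        self._model = model
        self._data: Dict[str, Tensor] = {}
        self.inference_count = 0

    def put(self, key: str, values: Tensor) -> None:
        self._data[key] = list(values)

    def run_model(self, in_key: str, out_key: str) -> None:
        self._data[out_key] = self._model(self._data[in_key])
        self.inference_count += 1

    def get(self, key: str) -> Tensor:
        return list(self._data[key])


def toy_eke_model(features: Tensor) -> Tensor:
    # Placeholder: mean of the inputs. NOT the network used in the paper.
    return [sum(features) / len(features)]


def run_member(store: SharedStore, member_id: int, n_steps: int) -> List[float]:
    """Run one toy ensemble member, requesting one inference per timestep."""
    prefix = f"member_{member_id}"  # assumed prefix so members do not overwrite each other
    results = []
    for step in range(n_steps):
        store.put(f"{prefix}.features", [float(member_id), float(step)])
        store.run_model(f"{prefix}.features", f"{prefix}.eke")
        results.append(store.get(f"{prefix}.eke")[0])
    return results


store = SharedStore(toy_eke_model)
ensemble = {m: run_member(store, m, n_steps=3) for m in range(12)}
print(store.inference_count)
```

Prefixing keys per member is one simple way to let every member call the same model without colliding on tensor names.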

Examples by Simulation Model

LAMMPS

SmartSim examples with LAMMPS, which is a Molecular Dynamics simulation model.

1. Online Analysis of Atom Position

  • Contributing User: CrayLabs
  • Tags: Molecular Dynamics, online analysis, visualizations

LAMMPS has dump styles, which are custom I/O methods that can be implemented by users. CrayLabs implemented a SMARTSIM dump style which uses the SmartRedis clients to stream data to an Orchestrator database created by SmartSim.

Once the data is in the database, any application with a SmartRedis client can consume that data. For this example, we have a simple Python script that uses iPyVolume to plot the data every 100 iterations.
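The producer and consumer pattern described above can be sketched without LAMMPS or a database. Everything here is an assumption made for illustration: `InMemoryStore` is a hypothetical thread-safe stand-in for the Orchestrator, the key scheme `positions_<step>` is invented, the "simulation" is a toy drift of two atoms, and plotting is replaced by collecting frames. Only the interval of 100 iterations comes from the example description.

```python
import threading
import time
from typing import Dict, List, Optional, Tuple

Positions = List[Tuple[float, float, float]]
DUMP_EVERY = 100  # interval taken from the example description


class InMemoryStore:
    """Hypothetical thread-safe stand-in for the Orchestrator database."""

    def __init__(self) -> None:
        self._data: Dict[str, Positions] = {}
        self._lock = threading.Lock()

    def put(self, key: str, value: Positions) -> None:
        with self._lock:
            self._data[key] = list(value)

    def poll(self, key: str, interval_s: float, tries: int) -> Optional[Positions]:
        # Check repeatedly until the key appears or the tries run out.
        for _ in range(tries):
            with self._lock:
                if key in self._data:
                    return list(self._data[key])
            time.sleep(interval_s)
        return None


def simulate(store: InMemoryStore, n_steps: int) -> None:
    """Toy producer: drift two atoms along x and dump every DUMP_EVERY steps."""
    positions: Positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    for step in range(1, n_steps + 1):
        positions = [(x + 1.0, y, z) for x, y, z in positions]
        if step % DUMP_EVERY == 0:
            store.put(f"positions_{step}", positions)


def consume(store: InMemoryStore, n_steps: int) -> Dict[int, Positions]:
    """Toy consumer: wait for each dumped frame; plotting would happen here."""
    frames: Dict[int, Positions] = {}
    for step in range(DUMP_EVERY, n_steps + 1, DUMP_EVERY):
        frame = store.poll(f"positions_{step}", interval_s=0.01, tries=500)
        if frame is None:
            raise TimeoutError(f"no data for step {step}")
        frames[step] = frame
    return frames


store = InMemoryStore()
producer = threading.Thread(target=simulate, args=(store, 300))
producer.start()
frames = consume(store, 300)
producer.join()
print(sorted(frames))
```

Because the consumer only polls for keys, it can start before, during, or after the simulation without either side knowing about the other.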

Examples by System

High Performance Computing systems are a bit like snowflakes: they are all different. Since each one has its own quirks, examples for specific, popular systems can benefit new users.

National Center for Atmospheric Research (NCAR)

1. Cheyenne

  • Contributing User: CrayLabs
  • Implementation (this repo)
  • WLM: PBSPro
  • System: SGI 8600
  • CPU: Intel
  • GPU: None

2. Casper

  • Contributing User: @jedwards4b
  • Implementation (this repo)
  • WLM: PBSPro
  • GPU: Nvidia
  • CPU: Intel
  • SmartSim Version: 0.3.2
  • SmartRedis Version: 0.2.0

Oak Ridge National Lab

1. Summit

  • Contributing User: CrayLabs
  • Implementation (this repo)
  • System:
  • OS: Red Hat Enterprise Linux (RHEL)
  • CPU: Power9
  • GPU: Nvidia V100