Python Research Framework

pyfra

The Python Research Framework.

Design Philosophy

Research code has some of the fastest-shifting requirements of any type of code. It's nearly impossible to plan the proper abstractions ahead of time, because it is exceedingly likely that, over the course of the project, what you originally thought was your main focus no longer is. Further, research code (especially in ML) often involves big and complicated pipelines, typically spanning many different machines, which are run either by hand or with shell scripts that are far more complicated than any shell script ever should be.

Therefore, the objective of pyfra is to make it as fast and low-friction as possible to write research code involving complex pipelines over many machines. This entails making it as easy as possible to implement a research idea in reality, at the cost of fine-grained control and the long-term maintainability of the system. In other words, pyfra expects that code will either be rapidly obsoleted by newer code, or rewritten using some other framework once it is no longer a research project and requirements have settled down.

Pyfra is in its very early stages of development. The interface may change rapidly and without warning.

Features:

  • Spin up an internal webserver complete with a permissions system using only a few lines of code
  • Extremely elegant shell integration—run commands on any server seamlessly. All the best parts of bash and python combined
  • Automated remote environment setup, so you never have to worry about provisioning machines by hand again
  • (WIP) Tools for painless functional programming in python
  • (Coming soon) High level API for experiment management/scheduling and resource provisioning
  • (Coming soon) Idempotent resumable data pipelines with no cognitive overhead
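
To make the "idempotent resumable" idea in the last feature concrete, here is a minimal sketch that uses only the Python standard library. It is not pyfra's API or implementation: `run_step`, the marker-file layout, and `expensive_step` are hypothetical names chosen for illustration. The sketch assumes each step returns a JSON-serializable result.

```python
import json
import tempfile
from pathlib import Path


def run_step(name, func, state_dir):
    """Run func once per step name; later calls return the saved result."""
    marker = Path(state_dir) / f"{name}.json"
    if marker.exists():
        # The step finished in an earlier run, so reuse its recorded result.
        return json.loads(marker.read_text())
    result = func()
    # Write to a temporary file and rename it afterwards, so an interrupted
    # run never leaves a half-written marker behind.
    tmp = marker.with_suffix(".tmp")
    tmp.write_text(json.dumps(result))
    tmp.replace(marker)
    return result


calls = []  # records how many times the step body actually ran


def expensive_step():
    # Hypothetical stand-in for a slow pipeline stage.
    calls.append(1)
    return {"rows": 3}


with tempfile.TemporaryDirectory() as state_dir:
    first = run_step("preprocess", expensive_step, state_dir)
    second = run_step("preprocess", expensive_step, state_dir)
```

Running the same step twice executes its body only once. The second call reads the stored result, which is what allows an interrupted pipeline to be restarted from the top without redoing finished work.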

Example code

from pyfra import *

loc = Remote()
# placeholder SSH targets in user@host form
rem = Remote("user@example.com")
nas = Remote("user@example2.com")

@page("Run experiment", dropdowns={'server': ['local', 'remote']})
def run_experiment(server: str, config_file: str, some_numerical_value: int, some_checkbox: bool):
    r = loc if server == 'local' else rem

    r.sh("git clone https://github.com/EleutherAI/gpt-neox")
    
    # rsync as a function can do local-local, local-remote, and remote-remote
    rsync(config_file, r.file("gpt-neox/configs/my-config.yml"))
    rsync(nas.file('some_data_file'), r.file('gpt-neox/data/whatever'))
    
    return r.sh('cd gpt-neox; python3 main.py')

@page("Write example file and copy")
def example():
    # write the file that the rsync calls below copy from the remote
    rem.fwrite("testing123.txt", "hello world")
    
    # local files can be specified as just a string
    rsync(rem.file('testing123.txt'), 'test1.txt')
    rsync(rem.file('testing123.txt'), loc.file('test2.txt'))

    loc.sh('cat test1.txt')
    
    assert fread('test1.txt') == fread('test2.txt')
    
    # fread, fwrite, etc can take a `rem.file` instead of a string filename.
    # you can also use all *read and *write functions directly on the remote too.
    assert fread('test1.txt') == fread(rem.file('testing123.txt'))
    assert fread('test1.txt') == rem.fread('testing123.txt')

    # ls as a function returns a list of files (with absolute paths) on the selected remote.
    # the returned value is displayed on the webpage.
    return '\n'.join(rem.ls('/'))

@page("List files in some directory")
def list_files(directory):
    return sh(f"ls -la {directory | quote}")


# start internal webserver
webserver()
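
The `list_files` page above interpolates a user-supplied value into a shell command, so the value must be quoted. The following sketch shows the same idea with the standard library's `shlex.quote` rather than pyfra's own `quote` helper; `build_ls_command` is a hypothetical name used only for this illustration.

```python
import shlex


def build_ls_command(directory):
    """Build an `ls -la` command with the directory quoted for the shell."""
    return f"ls -la {shlex.quote(directory)}"


# A plain path needs no quoting and passes through unchanged.
safe = build_ls_command("/tmp")

# Spaces and shell metacharacters are wrapped in single quotes, so the
# shell treats the whole value as one argument rather than extra commands.
tricky = build_ls_command("my dir; echo oops")
```

Without quoting, the second input would be split by the shell into an `ls` call followed by a separate `echo` command.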

Installation

pip3 install git+https://github.com/EleutherAI/pyfra/

The version on PyPI is not up to date; do not use it.

Comments
  • Try to install sudo in _install

    Sudo is installed in setup.apt(), which is not run when python_version=None is set for an env. This PR tries to install the sudo package in _install, which solves this issue.

    opened by kurumuz 1
  • Styling updates 2

    This should fix some issues that were noticed recently.

    • increases the width of the content in the middle
    • all button icons are now the same (until we figure out a better solution)
    • content that is overflowing should now be scrollable
    opened by jprester 0
  • Update styling

    I made some updates to styling for the admin dashboard pages.

    Stuff I did:

    • changed the styling to look like design mockup
    • moved IDs to classes in the CSS. IDs should be reserved for JavaScript selectors
    • added some svg icons
    • made the UI somewhat responsive
    opened by jprester 0
  • docs: docs are empty

    Screenshot from the RTD page:

    image

    I recommend checking the raw output of the build on the RTD dashboard.

    Probably some library installation issue when running setup.

    opened by TomFrederik 0
  • Type annotations

    Type annotations are a must-have for public facing library exports, as they allow users to infer a lot of information about calls/return values independent of documentation, as well as help with code completions.

    opened by hugbubby 0
Releases (v0.3.0)
  • v0.3.0 (Dec 9, 2021)

    What's new

    • Envs now resume where they left off (and Remotes have an option for turning this behaviour on)
    • @stage caching added

    Breaking Changes

    • delegation promoted to full submodule and experiment removed
    • pyfra.functional removed
    • pyfra.web deprecated and moved to contrib
    • contrib revamp

    Full Changelog: https://github.com/EleutherAI/pyfra/compare/8e775df36ca8f2ae39b0b7add9c30eab446207b1...9616e835578f8ad04a6d9c3b405777fc4b7e0853

  • v0.3.0rc6 (Sep 1, 2021)

Owner
EleutherAI