onelearn: Online learning in Python

Last update: Nov 06, 2022

Overview

Documentation | Reproduce experiments

onelearn stands for ONE-shot LEARNing. It is a small Python package for online learning. It provides:

online (or one-shot) learning algorithms: each sample is processed once, so only a single pass is performed on the data
multi-class classification and regression algorithms
for now, only ensemble methods, namely Random Forests
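
To make the "single pass" idea concrete, here is a minimal sketch of an online learner in plain Python. The class `RunningMeanClassifier` and its data are hypothetical and written only for illustration; this is not how onelearn's forests work internally. It only mirrors the `partial_fit` style of interface, where each sample updates the model once and is then discarded.

```python
from collections import defaultdict


class RunningMeanClassifier:
    """Toy one-pass classifier that keeps a running mean per class.

    Hypothetical illustration only; NOT onelearn's algorithm.
    """

    def __init__(self):
        self.counts = defaultdict(int)
        self.means = {}

    def partial_fit(self, X, y):
        # Each sample updates the state exactly once and is not stored.
        for features, label in zip(X, y):
            self.counts[label] += 1
            n = self.counts[label]
            mean = self.means.setdefault(label, [0.0] * len(features))
            for j, value in enumerate(features):
                mean[j] += (value - mean[j]) / n  # incremental mean update
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        # Predict the class whose running mean is closest to the sample.
        return [min(self.means, key=lambda c: sq_dist(x, self.means[c])) for x in X]


# A small made-up stream of (features, label) pairs.
stream = [([0.0, 0.1], 0), ([1.0, 0.9], 1), ([0.2, 0.0], 0), ([0.9, 1.1], 1)]

model = RunningMeanClassifier()
for features, label in stream:
    model.partial_fit([features], [label])  # one sample at a time, seen once

predictions = model.predict([[0.1, 0.0], [1.0, 1.0]])
```

The model never revisits earlier samples, which is what distinguishes this style of training from batch methods that iterate over the full dataset several times.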

Installation

The easiest way to install onelearn is using pip:

pip install onelearn

You can also install the latest development version directly from GitHub with:

pip install git+https://github.com/onelearn/onelearn.git

References

@article{mourtada2019amf,
  title={AMF: Aggregated Mondrian Forests for Online Learning},
  author={Mourtada, Jaouad and Ga{\"\i}ffas, St{\'e}phane and Scornet, Erwan},
  journal={arXiv preprint arXiv:1906.10529},
  year={2019}
}

Comments

Unable to pickle AMFClassifier.
I would like to save the AMFClassifier, but I am unable to pickle it. I have also tried dill and joblib, but they don't seem to work either.

Is there another way to export the AMFClassifier so that I can save it and load it in another kernel?

Below is a snippet of code that reproduces the error. Note that the error only occurs when pickling after the partial_fit method has been called. When the AMFClassifier has not been fit yet, pickling works without problems, but exporting an empty model is pretty useless.

Any help or tips are much appreciated.

from onelearn import AMFClassifier
import dill as pickle
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target

amf = AMFClassifier(n_classes=3)
dump = pickle.dumps(amf)
amf = pickle.loads(dump)

amf.partial_fit(X, y)
dump = pickle.dumps(amf)
amf = pickle.loads(dump)
opened by w-feijen 1
Move the paper's experiments into an experiments folder
Update the documentation

Explain that the repo must be cloned

Also move the short experiments to an examples folder and build a Sphinx gallery from them
enhancement
opened by stephanegaiffas 1
Add some extra tests
Test that batch versus online training leads to the exact same forest

Test the behavior of reserve_samples, with several calls to partial_fit to check that memory is correctly allocated and

tests
opened by stephanegaiffas 1
What if predict_proba receives a single sample

get_amf_decision_online
    amf.partial_fit(X_train[iteration - 1], y_train[iteration - 1])
  File "/Users/stephanegaiffas/Code/onelearn/onelearn/forest.py", line 259, in partial_fit
    n_samples, n_features = X.shape

opened by stephanegaiffas 1
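
The traceback above stops at `n_samples, n_features = X.shape`, which cannot unpack two values when a single row such as `X_train[iteration - 1]` is passed as a 1-D array. One possible caller-side workaround, sketched here with NumPy, is to reshape a single sample into a 2-D array with one row first. The helper name `as_2d_samples` and the sample values are hypothetical; this is not part of onelearn.

```python
import numpy as np


def as_2d_samples(X):
    """Return X as a 2-D array of shape (n_samples, n_features).

    Hypothetical helper: a 1-D input is treated as one sample.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(1, -1)  # one row, as many columns as features
    return X


X_train = np.array([[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]])

single = X_train[0]  # shape (4,): two-value unpacking of .shape would fail
n_samples, n_features = as_2d_samples(single).shape
```

The matching label would likewise need to be passed as a length-one array, for example with `np.atleast_1d`, assuming the estimator expects one label per row.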
Improve coverage

One problem is that @jit functions don't work with coverage. A workaround is to disable JIT compilation with the NUMBA_DISABLE_JIT environment variable, but this breaks the code that uses @jitclass and the .class_type.instance_type attributes.
enhancement bug fix

opened by stephanegaiffas 1
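
For reference, the workaround mentioned in this issue amounts to setting `NUMBA_DISABLE_JIT` in the environment before Numba is imported. A minimal sketch of preparing such an environment for a test subprocess follows; the helper name `env_without_jit` is hypothetical, and, as the issue notes, code relying on `@jitclass` may still break in this mode.

```python
import os


def env_without_jit(base=None):
    """Return a copy of an environment mapping with Numba's JIT disabled.

    Hypothetical helper; it does not modify the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["NUMBA_DISABLE_JIT"] = "1"  # should be set before Numba is imported
    return env


# Example with an explicit, made-up base environment.
env = env_without_jit({"PATH": "/usr/bin"})
```

The resulting mapping could then be passed to a test run, for instance `subprocess.run(["pytest", "--cov=onelearn"], env=env_without_jit())`, assuming pytest and the pytest-cov plugin are installed.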

Releases (v0.3)

v0.3 (Sep 29, 2021)
This release adds the following improvement:

AMFClassifier and AMFRegressor can be serialized to files with the save and load methods, which use pickle internally
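
The release note does not give the signatures of these methods, so the following is only a sketch of the general pattern: a `save` method that pickles plain Python state to a file, and a `load` class method that rebuilds the object from it. The class `TinyModel`, its attributes, and the file name are hypothetical; check the onelearn documentation for the actual `save` and `load` signatures.

```python
import pickle
import tempfile
from pathlib import Path


class TinyModel:
    """Hypothetical stand-in for a fitted estimator; NOT onelearn code."""

    def __init__(self, n_classes):
        self.n_classes = n_classes
        self.counts = [0] * n_classes

    def partial_fit(self, y):
        for label in y:
            self.counts[label] += 1
        return self

    def save(self, filename):
        # Store only plain Python data, which pickle handles reliably.
        state = {"n_classes": self.n_classes, "counts": self.counts}
        with open(filename, "wb") as f:
            pickle.dump(state, f)

    @classmethod
    def load(cls, filename):
        with open(filename, "rb") as f:
            state = pickle.load(f)
        model = cls(state["n_classes"])
        model.counts = state["counts"]
        return model


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    TinyModel(n_classes=3).partial_fit([0, 2, 2]).save(path)
    restored = TinyModel.load(path)
```

Exporting an explicit state dictionary, rather than pickling the object itself, is one common way to sidestep attributes that cannot be pickled directly.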

v0.2.0 (Apr 6, 2020)
This release adds the following improvements:

SampleCollection pre-allocates more samples instead of the bare minimum for faster computation

The playground can be launched from the library

Documentation on Read the Docs

Faster computations and a lot of code cleanup

Unit tests for Python 3.6-3.8


Owner

GitHub Repository https://onelearn.readthedocs.io

