🧬 Performant Evolutionary Algorithms For Python with Ray support

Overview

🧬 Ruck 🏉

Performant evolutionary algorithms for Python

license python versions pypi version tests status

Visit the examples to get started, or browse the releases.

Contributions are welcome!


Goals

Ruck aims to fill the following criteria:

  1. Provide high quality, readable implementations of algorithms.
  2. Be easily extensible and debuggable.
  3. Performant while maintaining its simplicity.

Citing Ruck

Please use the following citation if you use Ruck in your research:

@Misc{Michlo2021Ruck,
  author =       {Nathan Juraj Michlo},
  title =        {Ruck - Performant evolutionary algorithms for Python},
  howpublished = {Github},
  year =         {2021},
  url =          {https://github.com/nmichlo/ruck}
}

Overview

Ruck takes inspiration from PyTorch Lightning's module system. The population creation, offspring, evaluation and selection steps are all contained within a single module inheriting from EaModule. While the training logic and components are separated into its own class.

Members of a Population (A list of Members) are intended to be read-only. Modifications should not be made to members, instead new members should be created with the modified values. This enables us to easily implement efficient multi-threading, see below!

The trainer automatically constructs HallOfFame and LogBook objects which keep track of your population and offspring. EaModule provides defaults for get_stats_groups and get_progress_stats that can be overridden if you wish to customize the tracked statistics and statistics displayed by tqdm.

Minimal OneMax Example

import random
import numpy as np
from ruck import *


class OneMaxMinimalModule(EaModule):
    """
    Minimal onemax example
    - The goal is to flip all the bits of a boolean array to True
    - Offspring are generated as bit flipped versions of the previous population
    - Selection tournament is performed between the previous population and the offspring
    """

    # evaluate unevaluated members according to their values
    def evaluate_values(self, values):
        return [v.sum() for v in values]

    # generate 300 random members of size 100 with 50% bits flipped
    def gen_starting_values(self):
        return [np.random.random(100) < 0.5 for _ in range(300)]

    # randomly flip 5% of the bits of each each member in the population
    # the previous population members should never be modified
    def generate_offspring(self, population):
        return [Member(m.value ^ (np.random.random(m.value.shape) < 0.05)) for m in population]

    # selection tournament between population and offspring
    def select_population(self, population, offspring):
        combined = population + offspring
        return [max(random.sample(combined, k=3), key=lambda m: m.fitness) for _ in range(len(population))]


if __name__ == '__main__':
    # create and train the population
    module = OneMaxMinimalModule()
    pop, logbook, halloffame = Trainer(generations=100, progress=True).fit(module)

    print('initial stats:', logbook[0])
    print('final stats:', logbook[-1])
    print('best member:', halloffame.members[0])

Advanced OneMax Example

Ruck provides various helper functions and implementations of evolutionary algorithms for convenience. The following example makes use of these additional features so that components and behaviour can easily be swapped out.

The three basic evolutionary algorithms provided are based around deap's default algorithms from deap.algorithms. These basic evolutionary algorithms can be created from ruck.functional.make_ea. We provide the alias ruck.R for ruck.functional. R.make_ea supports the following modes: simple, mu_plus_lambda and mu_comma_lambda.

Code Example

Population: return [ np.random.random(self.hparams.member_size) < 0.5 for i in range(self.hparams.population_size) ] if __name__ == '__main__': # create and train the population module = OneMaxModule(population_size=300, member_size=100) pop, logbook, halloffame = Trainer(generations=40, progress=True).fit(module) print('initial stats:', logbook[0]) print('final stats:', logbook[-1]) print('best member:', halloffame.members[0]) ">
"""
OneMax serial example based on:
https://github.com/DEAP/deap/blob/master/examples/ga/onemax_numpy.py
"""

import functools
import numpy as np
from ruck import *


class OneMaxModule(EaModule):

    def __init__(
        self,
        population_size: int = 300,
        offspring_num: int = None,  # offspring_num (lambda) is automatically set to population_size (mu) when `None`
        member_size: int = 100,
        p_mate: float = 0.5,
        p_mutate: float = 0.5,
        ea_mode: str = 'simple'
    ):
        # save the arguments to the .hparams property. values are taken from the
        # local scope so modifications can be captured if the call to this is delayed.
        self.save_hyperparameters()
        # implement the required functions for `EaModule`
        self.generate_offspring, self.select_population = R.make_ea(
            mode=self.hparams.ea_mode,
            offspring_num=self.hparams.offspring_num,
            mate_fn=R.mate_crossover_1d,
            mutate_fn=functools.partial(R.mutate_flip_bit_groups, p=0.05),
            select_fn=functools.partial(R.select_tournament, k=3),
            p_mate=self.hparams.p_mate,
            p_mutate=self.hparams.p_mutate,
        )

    def evaluate_values(self, values):
        return map(np.sum, values)

    def gen_starting_values(self) -> Population:
        return [
            np.random.random(self.hparams.member_size) < 0.5
            for i in range(self.hparams.population_size)
        ]


if __name__ == '__main__':
    # create and train the population
    module = OneMaxModule(population_size=300, member_size=100)
    pop, logbook, halloffame = Trainer(generations=40, progress=True).fit(module)

    print('initial stats:', logbook[0])
    print('final stats:', logbook[-1])
    print('best member:', halloffame.members[0])

Multithreading OneMax Example (Ray)

If we need to scale up the computational requirements, for example requiring increased member and population sizes, the above serial implementations will soon run into performance problems.

The basic Ruck implementations of various evolutionary algorithms are designed around a map function that can be swapped out to add multi-threading support. We can easily do this using ray and we even provide various helper functions that enhance ray support.

  1. We begin by placing member's values into shared memory using ray's read-only object store and the ray.put function. These ObjectRef's values point to the original np.ndarray values. When retrieved with ray.get they obtain the original arrays using an efficient zero-copy procedure. This is advantageous over something like Python's multiprocessing module which uses expensive pickle operations to pass data around.

  2. The second step is to swap out the aforementioned map function in the previous example to a multiprocessing equivalent. We use ray.remote along with ray.get, and provide the ray_map function that has the same API as python map, but accepts ray.remote(my_fn).remote values instead.

  3. Finally we need to update our mate and mutate functions to handle ObjectRefs, we provide a convenient wrapper to automatically call ray.put on function results so that you can re-use your existing code.

Code Example

"""
OneMax parallel example using ray's object store.

8 bytes * 1_000_000 * 128 members ~= 128 MB of memory to store this population.
This is quite a bit of processing that needs to happen! But using ray
and its object store we can do this efficiently!
"""

from functools import partial
import numpy as np
from ruck import *
from ruck.external.ray import *


class OneMaxRayModule(EaModule):

    def __init__(
        self,
        population_size: int = 300,
        offspring_num: int = None,  # offspring_num (lambda) is automatically set to population_size (mu) when `None`
        member_size: int = 100,
        p_mate: float = 0.5,
        p_mutate: float = 0.5,
        ea_mode: str = 'mu_plus_lambda'
    ):
        self.save_hyperparameters()
        # implement the required functions for `EaModule`
        self.generate_offspring, self.select_population = R.make_ea(
            mode=self.hparams.ea_mode,
            offspring_num=self.hparams.offspring_num,
            # decorate the functions with `ray_remote_put` which automatically
            # `ray.get` arguments that are `ObjectRef`s and `ray.put`s returned results
            mate_fn=ray_remote_puts(R.mate_crossover_1d).remote,
            mutate_fn=ray_remote_put(R.mutate_flip_bit_groups).remote,
            # efficient to compute locally
            select_fn=partial(R.select_tournament, k=3),
            p_mate=self.hparams.p_mate,
            p_mutate=self.hparams.p_mutate,
            # ENABLE multiprocessing
            map_fn=ray_map,
        )
        # eval function, we need to cache it on the class to prevent
        # multiple calls to ray.remote. We use ray.remote instead of
        # ray_remote_put like above because we want the returned values
        # not object refs to those values.
        self._ray_eval = ray.remote(np.mean).remote

    def evaluate_values(self, values):
        # values is a list of `ray.ObjectRef`s not `np.ndarray`s
        # ray_map automatically converts np.sum to a `ray.remote` function which
        # automatically handles `ray.get`ing of `ray.ObjectRef`s passed as arguments
        return ray_map(self._ray_eval, values)

    def gen_starting_values(self):
        # generate objects and place in ray's object store
        return [
            ray.put(np.random.random(self.hparams.member_size) < 0.5)
            for i in range(self.hparams.population_size)
        ]


if __name__ == '__main__':
    # initialize ray to use the specified system resources
    ray.init()

    # create and train the population
    module = OneMaxRayModule(population_size=128, member_size=1_000_000)
    pop, logbook, halloffame = Trainer(generations=200, progress=True).fit(module)

    print('initial stats:', logbook[0])
    print('final stats:', logbook[-1])
    print('best member:', halloffame.members[0])

You might also like...
Implementation of Apriori algorithms via Python

Installing run bellow command for installing all packages pip install -r requirements.txt Data Put csv data under this directory "infrastructure/data

A simple python application to visualize sorting algorithms.
A simple python application to visualize sorting algorithms.

Visualize sorting algorithms A simple python application to visualize sorting algorithms. Sort Algorithms Name Function Name O( ) Bubble Sort bubble_s

Programming Foundations Algorithms With Python
Programming Foundations Algorithms With Python

Programming-Foundations-Algorithms Algorithms purpose to solve a specific proplem with a sequential sets of steps for instance : if you need to add di

Repository for Comparison based sorting algorithms in python

Repository for Comparison based sorting algorithms in python. This was implemented for project one submission for ITCS 6114 Data Structures and Algorithms under the guidance of Dr. Dewan at the University of North Carolina at Charlotte, Fall 2021.

Python Client for Algorithmia Algorithms and Data API

Algorithmia Common Library (python) Python client library for accessing the Algorithmia API For API documentation, see the PythonDocs Algorithm Develo

Algorithms and data structures for educational, demonstrational and experimental purposes.

Algorithms and Data Structures (ands) Introduction This project was created for personal use mostly while studying for an exam (starting in the month

This is the code repository for 40 Algorithms Every Programmer Should Know , published by Packt.
This is the code repository for 40 Algorithms Every Programmer Should Know , published by Packt.

40 Algorithms Every Programmer Should Know, published by Packt

Solving a card game with three search algorithms: BFS, IDS, and A*

Search Algorithms Overview In this project, we want to solve a card game with three search algorithms. In this card game, we have to sort our cards by

Nature-inspired algorithms are a very popular tool for solving optimization problems.
Nature-inspired algorithms are a very popular tool for solving optimization problems.

Nature-inspired algorithms are a very popular tool for solving optimization problems. Numerous variants of nature-inspired algorithms have been develo

Comments
  • MANIFEST.in

    MANIFEST.in

    I'm trying to get this project onto Conda Forge, to facilitate that, is it possible to include an MANIFEST.in file to enumerate the sdist? https://packaging.python.org/guides/using-manifest-in/

    enhancement good first issue question 
    opened by thewchan 1
  • [FEATURE]: Multi-Threading & Algorithm Helpers/Factories

    [FEATURE]: Multi-Threading & Algorithm Helpers/Factories

    Currently all algorithms extend from EaModule, often various similarities exist in implementations and code is repeated.

    Helper functions, subclasses of EaModule or maybe even mixins might be worth investigating.

    eg.

    class Problem(EaModuleRay):
        ... 
        
    # OR
    
    class Problem(EaModule, EaRayMixin):
        ...
    

    I haven't really pondered too much about these but they may be worthwhile?

    enhancement 
    opened by nmichlo 0
Releases(v0.2.4)
  • v0.2.4(Oct 20, 2021)

    Additions

    • Functionally equivalent version of NSGA-ii implemented at ruck.functional.select_nsga2
      • if numba is installed then the function will be JIT compiled for up to 65x faster performance
      • minor differences exist compared to ruck.external.deap.select_nsga2, but overall results should be similar.

    Deprecations

    • ruck.external.deap.select_nsga2 has been deprecated and will be removed in v0.3.0
    Source code(tar.gz)
    Source code(zip)
  • v0.2.3(Oct 17, 2021)

  • v0.2.2(Sep 27, 2021)

    Breaking change if you are using the ray helper functions that allows us to remove the optional dependencies. The core ruck API remains the same.

    • ruck.util._ray moved to ruck.external.ray
    • added ruck.external.deap with select_nsga2 that uses deep behind the scenes. The goal is to implement the ourselves in the near future. Currently it is a bit slow, and means me need extra deps.
    • More examples!
    Source code(tar.gz)
    Source code(zip)
  • v0.2.1(Sep 25, 2021)

  • v0.2.0(Sep 25, 2021)

    Changes

    • Ray remote function support was broken because of algorithm design decisions and use of nested functions and decorators. Updated to remove these limitations. API changes slightly.
    • R.mate_crossover_nd added to compliment R.mate_crossover_1d
    • Training checks that all the values in a population maintain the same type, this can help avoid ray multithreading errors when object refs are required instead of values.
    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Sep 25, 2021)

  • v0.1.0.dev1(Sep 25, 2021)

Owner
Nathan
Master's student with interests in representation and reinforcement learning.
Nathan
Python algorithm to determine the optimal elevation threshold of a GNSS receiver, by using a statistical test known as the Brown-Forsynthe test.

Levene and Brown-Forsynthe: Test for variances Application to Global Navigation Satellite Systems (GNSS) Python algorithm to determine the optimal ele

Nicolas Gachancipa 2 Aug 09, 2022
Using Bayesian, KNN, Logistic Regression to classify spam and non-spam.

Make Sure the dataset file "spamData.mat" is in the folder spam\src Environment: Python --version = 3.7 Third Party: numpy, matplotlib, math, scipy

0 Dec 26, 2021
RRT algorithm and its optimization

RRT-Algorithm-Visualisation This is a project that aims to develop upon the RRT

Sarannya Bhattacharya 7 Mar 06, 2022
Esse repositório tem como finalidade expor os trabalhos feitos para disciplina de Algoritmos computacionais e estruturais do CEFET-RJ no ano letivo de 2021.

Exercícios de Python 🐍 Esse repositório tem como finalidade expor os trabalhos feitos para disciplina de Algoritmos computacionais e estruturais do C

Rafaela Bezerra de Figueiredo 1 Nov 20, 2021
Provide player's names and mmr and generate mathematically balanced teams

Lollo's matchmaking algorithm Provide player's names and mmr and generate mathematically balanced teams How to use Fill the input.json file with your

4 Aug 04, 2022
A Python project for optimizing the 8 Queens Puzzle using the Genetic Algorithm implemented in PyGAD.

8QueensGenetic A Python project for optimizing the 8 Queens Puzzle using the Genetic Algorithm implemented in PyGAD. The project uses the Kivy cross-p

Ahmed Gad 16 Nov 13, 2022
Leveraging Unique CPS Properties to Design Better Privacy-Enhancing Algorithms

Differential_Privacy_CPS Python implementation of the research paper Leveraging Unique CPS Properties to Design Better Privacy-Enhancing Algorithms Re

Shubhesh Anand 2 Dec 14, 2022
This repository is not maintained

This repository is no longer maintained, but is being kept around for educational purposes. If you want a more complete algorithms repo check out: htt

Nic Young 2.8k Dec 30, 2022
A simple python implementation of A* and bfs algorithm solving Eight-Puzzle

A simple python implementation of A* and bfs algorithm solving Eight-Puzzle

2 May 22, 2022
This project consists of a collaborative filtering algorithm to predict movie reviews ratings from a dataset of Netflix ratings.

Collaborative Filtering - Netflix movie reviews Description This project consists of a collaborative filtering algorithm to predict movie reviews rati

Shashank Kumar 1 Dec 21, 2021
The test data, code and detailed description of the AW t-SNE algorithm

AW-t-SNE The test data, code and result of the AW t-SNE algorithm Structure of the folder Datasets: This folder contains two datasets, the MNIST datas

1 Mar 09, 2022
frePPLe - open source supply chain planning

frePPLe Open source supply chain planning FrePPLe is an easy-to-use and easy-to-implement open source advanced planning and scheduling tool for manufa

frePPLe 385 Jan 06, 2023
A litle algorithm that i made for transform a picture in a spreadsheet.

PicsToSheets How it works? It is an algorithm designed to transform an image into a spreadsheet file. this converts image pixels to color cells of she

Guilherme de Oliveira 1 Nov 12, 2021
Fedlearn algorithm toolkit for researchers

Fedlearn algorithm toolkit for researchers

89 Nov 14, 2022
Python based framework providing a simple and intuitive framework for algorithmic trading

Harvest is a Python based framework providing a simple and intuitive framework for algorithmic trading. Visit Harvest's website for details, tutorials

100 Jan 03, 2023
A calculator to test numbers against the collatz conjecture

The Collatz Calculator This is an algorithm custom built by Kyle Dickey, used to test numbers against the simple rules of the Collatz Conjecture. Get

Kyle Dickey 2 Jun 14, 2022
Multiple Imputation with Random Forests in Python

miceforest: Fast, Memory Efficient Imputation with lightgbm Fast, memory efficient Multiple Imputation by Chained Equations (MICE) with lightgbm. The

Samuel Wilson 202 Dec 31, 2022
It is a platform that implements some path planning algorithms.

PathPlanningAlgorithms It is a platform that implements some path planning algorithms. Main dependence: python3.7, opencv4.1.1.26 (for image show) Tip

5 Feb 24, 2022
This is a demo for AAD algorithm.

Asynchronous-Anisotropic-Diffusion-Algorithm This is a demo for AAD algorithm. The subroutine of the anisotropic diffusion algorithm is modified from

3 Mar 21, 2022