A library for pattern matching on symbolic expressions in Python.

Overview

MatchPy

MatchPy is a library for pattern matching on symbolic expressions in Python.

Work in progress

Latest version released on PyPi Latest version released via conda-forge Test coverage Build status of the master branch Documentation Status The Journal of Open Source Software Digital Object Identifier

Installation

MatchPy is available via PyPI, and for Conda via conda-forge. It can be installed with pip install matchpy or conda install -c conda-forge matchpy.

Overview

This package implements pattern matching in Python. Pattern matching is a powerful tool for symbolic computations, operating on symbolic expressions. Given a pattern and an expression (which is usually called subject), the goal of pattern matching is to find a substitution for all the variables in the pattern such that the pattern becomes the subject. As an example, consider the pattern f(x), where f is a function and x is a variable, and the subject f(a), where a is a constant symbol. Then the substitution that replaces x with a is a match. MatchPy supports associative and/or commutative function symbols, as well as sequence variables, similar to pattern matching in Mathematica.

A detailed example of how to use MatchPy can be found here.

MatchPy supports both one-to-one and many-to-one pattern matching. The latter makes use of similarities between patterns to efficiently find matches for multiple patterns at the same time.

A list of publications about MatchPy can be found below.

Expressions

Expressions are tree-like data structures, consisting of operations (functions, internal nodes) and symbols (constants, leaves):

>>> from matchpy import Operation, Symbol, Arity
>>> f = Operation.new('f', Arity.binary)
>>> a = Symbol('a')
>>> print(f(a, a))
f(a, a)

Patterns are expressions which may contain wildcards (variables):

>>> from matchpy import Pattern, Wildcard
>>> x = Wildcard.dot('x')
>>> print(Pattern(f(a, x)))
f(a, x_)

In the previous example, x is the name of the variable. However, it is also possible to use wildcards without names:

>>> w = Wildcard.dot()
>>> print(Pattern(f(w, w)))
f(_, _)

It is also possible to assign variable names to entire subexpressions:

>>> print(Pattern(f(w, a, variable_name='y')))
y: f(_, a)

Pattern Matching

Given a pattern and an expression (which is usually called subject), the idea of pattern matching is to find a substitution that maps wildcards to expressions such that the pattern becomes the subject. In MatchPy, a substitution is a dict that maps variable names to expressions.

>>> from matchpy import match
>>> y = Wildcard.dot('y')
>>> b = Symbol('b')
>>> subject = f(a, b)
>>> pattern = Pattern(f(x, y))
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{x ↦ a, y ↦ b}

Applying the substitution to the pattern results in the original expression.

>>> from matchpy import substitute
>>> print(substitute(pattern, substitution))
f(a, b)

Sequence Wildcards

Sequence wildcards are wildcards that can match a sequence of expressions instead of just a single expression:

>>> z = Wildcard.plus('z')
>>> pattern = Pattern(f(z))
>>> subject = f(a, b)
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{z ↦ (a, b)}

Associativity and Commutativity

MatchPy natively supports associative and/or commutative operations. Nested associative operators are automatically flattened, the operands in commutative operations are sorted:

>>> g = Operation.new('g', Arity.polyadic, associative=True, commutative=True)
>>> print(g(a, g(b, a)))
g(a, a, b)

Associativity and commutativity is also considered for pattern matching:

>>> pattern = Pattern(g(b, x))
>>> subject = g(a, a, b)
>>> print(next(match(subject, pattern)))
{x ↦ g(a, a)}
>>> h = Operation.new('h', Arity.polyadic)
>>> pattern = Pattern(h(b, x))
>>> subject = h(a, a, b)
>>> list(match(subject, pattern))
[]

Many-to-One Matching

When a fixed set of patterns is matched repeatedly against different subjects, matching can be sped up significantly by using many-to-one matching. The idea of many-to-one matching is to construct a so called discrimination net, a data structure similar to a decision tree or a finite automaton that exploits similarities between patterns. In MatchPy, there are two such data structures, implemented as classes: DiscriminationNet and ManyToOneMatcher. The DiscriminationNet class only supports syntactic pattern matching, that is, operations are neither associative nor commutative. Sequence variables are not supported either. The ManyToOneMatcher class supports associative and/or commutative matching with sequence variables. For syntactic pattern matching, the DiscriminationNet should be used, as it is usually faster.

>>> pattern1 = Pattern(f(a, x))
>>> pattern2 = Pattern(f(y, b))
>>> matcher = ManyToOneMatcher(pattern1, pattern2)
>>> subject = f(a, b)
>>> matches = matcher.match(subject)
>>> for matched_pattern, substitution in sorted(map(lambda m: (str(m[0]), str(m[1])), matches)):
...     print('{} matched with {}'.format(matched_pattern, substitution))
f(a, x_) matched with {x ↦ b}
f(y_, b) matched with {y ↦ a}

Roadmap

Besides the existing features, we plan on adding the following to MatchPy:

  • Support for Mathematica's Alternatives: For example f(a | b) would match either f(a) or f(b).
  • Support for Mathematica's Repeated: For example f(a..) would match f(a), f(a, a), f(a, a, a), etc.
  • Support pattern sequences (PatternSequence in Mathematica). These are mainly useful in combination with Alternatives or Repeated, e.g. f(a | (b, c)) would match either f(a) or f(b, c). f((a a)..) would match any f with an even number of a arguments.
  • All these additional pattern features need to be supported in the ManyToOneMatcher as well.
  • Better integration with existing types such as dict.
  • Code generation for both one-to-one and many-to-one matching. There is already an experimental implementation, but it still has some dependencies on MatchPy which can probably be removed.
  • Improving the documentation with more examples.
  • Better test coverage with more randomized tests.
  • Implementation of the matching algorithms in a lower-level language, for example C, both for performance and to make MatchPy's functionality available in other languages.

Contributing

If you have some issue or want to contribute, please feel free to open an issue or create a pull request. Help is always appreciated!

The Makefile has several tasks to help development:

  • To install all needed packages, you can use make init .
  • To run the tests you can use make test. The tests use pytest.
  • To generate the documentation you can use make docs .
  • To run the style checker (pylint) you can use make check .

If you have any questions or need help with setting things up, please open an issue and we will try the best to assist you.

Publications

Manuel Krebber and Henrik Barthels
Journal of Open Source Software, Volume 3(26), pp. 2, June 2018.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 7th Workshop on Python for High-Performance and Scientific Computing, November 2017.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 15th Python in Science Conference, July 2017.

Manuel Krebber
Master Thesis, RWTH Aachen University, May 2017

If you want to cite MatchPy, please reference the JOSS paper:

@article{krebber2018,
    author    = {Manuel Krebber and Henrik Barthels},
    title     = {{M}atch{P}y: {P}attern {M}atching in {P}ython},
    journal   = {Journal of Open Source Software},
    year      = 2018,
    pages     = 2,
    month     = jun,
    volume    = {3},
    number    = {26},
    doi       = "10.21105/joss.00670",
    web       = "http://joss.theoj.org/papers/10.21105/joss.00670",
}
Comments
  • `ManyToOneReplacer` slow for small expressions

    `ManyToOneReplacer` slow for small expressions

    I have added ~1700 ReplacementRules in ManyToOneReplacer. The .replace() has become very slow even for very small expressions(like 2*x).

    File: https://github.com/parsoyaarihant/sympy/blob/502fd311d64155560383a5f4f3b2710c318dc410/sympy/integrals/rubi/patterns.py

    opened by parsoyaarihant 38
  • Added internal implementation of the Hopcroft-Karp algorithm

    Added internal implementation of the Hopcroft-Karp algorithm

    Added Hopcroft-Karp algorithm in order to free MatchPy of its GPL dependency. Readapted from the C++ implementation in SymEngine.

    Translating the C++ implementation in SymEngine back to Python.

    I'm the author of the C++ implementation so no copyright notice mentioning SymEngine is necessary.

    opened by Upabjojr 23
  • `ManyToOneReplacer` for Rubi

    `ManyToOneReplacer` for Rubi

    Hi, I implemented ~80 rules using ManyToOneReplacer for Rubi integration and MatchPy was able to match the subjects very quickly. However, when I added ~550 rules, there was significant decrease in speed while matching even smaller subjects.

    It would be great if you could advice us on how to solve this issue.

    Here are the files for our project:

    Patterns: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/patterns.py Constraint: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/constraint.py All Rubi Files: https://github.com/parsoyaarihant/sympy/tree/rubi4/sympy/rubi

    opened by parsoyaarihant 21
  • Optional arguments in machpy

    Optional arguments in machpy

    I wanted to know if matchpy supports optional arguments during match.

    Mathematica: https://reference.wolfram.com/language/ref/Optional.html

    Example:

    >>> x = Symbol('x')
    >>> a_ = Wildcard.dot('a', optional=1)
    >>> a_free = CustomConstraint(lambda a, x: str(x) not in a.symbols)
    >>> x_ = Wildcard.dot('x')
    >>> pattern = Pattern(Mul(a_, x_), a_free)
    >>> next(match(x, pattern))
    {a: 1}
    
    opened by parsoyaarihant 20
  • Fixing the code generator

    Fixing the code generator

    The code generator enforces the reading of all variables whenever a constraint has been encountered. In this PR, the constraint is check only if all variables have been read.

    Furthermore, indentation has been changes to four spaces and a related bug forcing to always use bugs has been fixed as well.

    opened by Upabjojr 15
  • Repeated definition of the same CustomConstraint

    Repeated definition of the same CustomConstraint

    In the current RUBI rules patterns in SymPy, the same constraint is defined many times. For example: CustomConstraint(lambda a, x: FreeQ(a, x)) has its definition repeated more and more, like in:

    https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L22

    and in the next rule the same constraint is redefined: https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L26

    Python defines a new object for every new lambda definition, even if the functions return the same identical value.

    Does this practice have some potential negative consequences, like:

    • the size of the generated decision tree?
    • the overall speed of MatchPy?
    opened by Upabjojr 13
  • using `int` as Wildcard

    using `int` as Wildcard

    Hi, I am trying to define class ConstantWild (similar to ConstantSymbol you suggested earlier).

    Here is the code:

    from matchpy import Wildcard
    
    class ConstantWild(Wildcard):
        def __init__(self, value):
            super(self.__class__, self).__init__(min_count=1, fixed_size=True, variable_name=str(value))
            self.value = value
    
        def __str__(self):
            return str(self.value)
    

    The above code have same attributes as Wildcard.dot but I don't understand why it is giving errors.

    opened by parsoyaarihant 9
  • Custom sorting key for substitute( )

    Custom sorting key for substitute( )

    Substitute now accepts custom sorting key to specify custom sorting for elements in Multiset.

    This should solve the compatibility problem with SymPy which does not allow comparison operators (>, >=, <, <=) to be evaluated to booleans if the truth of the expression cannot be easily determined.

    opened by Upabjojr 8
  • Need Help in code generation.

    Need Help in code generation.

    I could not find any documentation related to code generation. Last year @wheerd had written a code generation script for rubi in sympy. https://gist.github.com/wheerd/13ac5a0e9b560db6201d96136fa0bbcc

    But code generated was huge so, It cannot be used. This year I have removed the repeated definition of constraints. So code generation should work probably.

    However, I am unable to write the code generation script. Here is the new structure of rubi in sympy. Can someone update the gist accordingly?

    rules = [] #list to keep track which rules has been applied
    def cons_f1(m, x):
        return FreeQ(m, x)
    cons1 = CustomConstraint(cons_f1)
    
    def cons_f2(m):
        return NonzeroQ(m + S(1))
    cons2 = CustomConstraint(cons_f2)
    
    pattern1 = Pattern(Integral(x_**WC('m', S(1)), x_), cons1, cons2)
    def replacement1(m, x):
        rules.append(1)
        return Simp(x**(m + S(1))/(m + S(1)), x)
    rule1 = ReplacementRule(pattern1, replacement1)
    
    help wanted question 
    opened by ashishkg0022 7
  • Alternate way to define constraint

    Alternate way to define constraint

    I wanted to know if there is alternatative way to define constraint in matchpy. Example:

    
    def FreeQ(vars, x): # returns True of any `var` contains `x`
    	if any(i.contains(x) for i in vars):
    		return False
    	return True
    
    def NonzeroQ(e): # returns True if `e` is not zero.
    	return e != 0
    
    pattern = Pattern(Int(Pow(Add(a_, Mul(b_, x_)), m_), x_), FreeQ(list(a, b, m), x), NonzeroQ(Add(m, 1)))
    

    Thanks.

    opened by parsoyaarihant 7
  • GPL'd dependency: hopcroftkarp library

    GPL'd dependency: hopcroftkarp library

    The hopcroftkarp library upon which this project depends is GPL'd.

    I have re-implemented the Hopcroft-Karp algorithm in C++ as part of an attempt to make MatchPy usable in SymEngine (that is, SymPy's core rewritten in C++). I didn't perform extensive testing, so there may be bugs. That code can be translated into Python.

    enhancement 
    opened by Upabjojr 6
  • FreeQ equivalent in MatchPy?

    FreeQ equivalent in MatchPy?

    Mathematica has FreeQ to test whether an expression contains a symbol. This is very useful in pattern matching for equations as you can specify that, for example, in a * x + b == 0 the variables a and b should not contain the variable x.

    In SymPy we are currently using things like CustomConstraint(lambda a, x: not a.has(x)), where a.has(x) is a SymPy expression that tells you if x is contained in the expression tree of a.

    Would it make sense to add an optimized FreeQ-like tester that checks whether the variable is contained in the expression during the matching iteration of MatchPy?

    opened by Upabjojr 1
  • Optional wildcards to match the identity element of the current node

    Optional wildcards to match the identity element of the current node

    When using the optional=value argument in Wildcard, one can specify which value to get in case no matching is found. The value has to be given for every wildcard.

    In case of addition and multiplication, the usual optional values are the identity elements, zero and one respectively.

    It would be nice to have the possibility to have the optional value to return the identity element of the current node, if available.

    enhancement 
    opened by Upabjojr 7
  • substitute not working with SymPy

    substitute not working with SymPy

    Calling sorted( ) in https://github.com/HPAC/matchpy/blob/5cae3f275e3a1f725518516bf3c700dc3be03a56/matchpy/functions.py#L87 fails with SymPy, as SymPy objects are not comparable with operators (<, <=, >, >=). Operators are overloaded in SymPy to create instances of inequality objects.

    SymPy has a function called default_sort_key meant to deal with this problem:

    from sympy import symbols
    x, y, z = symbols("x y z")
    from sympy.core.compatibility import default_sort_key
    sorted([z, x, y], key=default_sort_key)
    

    The question is, is it possible to modify MatchPy to specify a custom sorting key so that SymPy can specify the way its objects should be sorted?

    Or is it an issue with Multiset?

    opened by Upabjojr 11
  • One-to-one matching: symbols with variable_name do not work in commutative functions

    One-to-one matching: symbols with variable_name do not work in commutative functions

    In one-to-one matching, symbols that have a variable_name do not work in commutative functions. They seem to work correctly in many-to-one matching. Example:

    from matchpy import Operation, Symbol, Arity, Pattern, match, ManyToOneMatcher
    
    f = Operation.new('f', Arity.binary, commutative=True)
    a_x = Symbol('a', variable_name='x')
    a = Symbol('a')
    b = Symbol('b')
    
    subject = f(a, b)
    pattern = Pattern(f(a_x, b))
    
    print(list(match(subject, pattern)))
    
    matcher = ManyToOneMatcher(pattern)
    print(list(matcher.match(subject)))
    

    This results in

    []
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    but it should result in

    [{'x': Symbol('a')}]
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    So far, we don't have any tests that verify this functionality. Once it is fixed, we should add some.

    bug 
    opened by hbarthels 0
  • Match arbitrary function

    Match arbitrary function

    Is there any way to match a function with a certain number of arguments? Suppose I've defined a binary function sum, is there a way to create a pattern f(a, b) that matches sum(x, y)?

    enhancement 
    opened by sidmani 4
  • Improve checking correctness of input

    Improve checking correctness of input

    To avoid issues such as #60, we should check that objects are created and functions are used correctly, at least in those cases where so far, incorrect input does not produce any errors.

    enhancement 
    opened by hbarthels 0
Releases(0.5.5)
Owner
High-Performance and Automatic Computing
Umeå University & RWTH Aachen
High-Performance and Automatic Computing
reproduces experiments from

Installation To enable importing of modules, from the parent directory execute: pip install -e . To install requirements: python -m pip install requir

Meta Research 15 Aug 11, 2022
【AI创造营】参赛作品

-AI-emmmm 【AI创造营】参赛作品 鬼畜小视频 AiStuido地址:https://aistudio.baidu.com/aistudio/projectdetail/1647685 BiliBili视频地址:https://www.bilibili.com/video/BV1Zv411b

107 Nov 09, 2022
An alternative site to emplea.do due to inconsistent service of the app.

feline a agile and fast alternative to emplea.do License: MIT Settings Moved to settings. Basic Commands Setting Up Your Users To create a normal user

Codetiger 8 Nov 10, 2021
Python module to work with Magneto Database directly without using broken Magento 2 core

Python module to work with Magneto Database directly without using broken Magento 2 core

Egor Shitikov 13 Nov 10, 2022
A Python feed reader library.

reader is a Python feed reader library. It aims to allow writing feed reader applications without any business code, and without enforcing a dependenc

266 Dec 30, 2022
A Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library

gdsclient NOTE: This is a work in progress and many GDS features are known to be missing or not working properly. This repo hosts the sources for gdsc

Neo4j 100 Dec 20, 2022
User management system (UMS), has the primary purpose of connecting to an Active Directory (AD)

💿 Sistema de Gerenciamento de Usuário (SGU) 📚 Sobre o projeto Sistema de gerenciamento de usuários (SGU), tem o objetivo primário de se conectar a u

Patrick Viegas 2 Feb 25, 2022
Integration between the awesome window manager and the firefox web browser.

Integration between the awesome window manager and the firefox web browser.

contribuewwt 3 Feb 02, 2022
tg-nearby Trilateration of nearby Telegram users as described in my corresponding article.

tg-nearby Trilateration of nearby Telegram users as described in my corresponding article. Setup If you want to toy with the code in this repository

Maximilian Jugl 75 Dec 26, 2022
Terrible python code from the "bubble that breaks maths" video.

Terrible python code from the "bubble that breaks maths" video.

Stand-up Maths 12 Oct 25, 2022
Service for working with open data of the State Duma of the Russian Federation

Сервис для работы с открытыми данными Госдумы РФ Исходные данные из API Госдумы РФ извлекаются с помощью Apache Nifi и приземляются в хранилище Clickh

Aleksandr Sergeenko 2 Feb 14, 2022
A simple app that helps to train quick calculations.

qtcounter A simple app that helps to train quick calculations. Usage Manual Clone the repo in a folder using git clone https://github.com/Froloket64/q

0 Nov 27, 2021
Make your functions return something meaningful, typed, and safe!

Make your functions return something meaningful, typed, and safe! Features Brings functional programming to Python land Provides a bunch of primitives

dry-python 2.5k Jan 03, 2023
Заглушки .NET библиотек для IronPython

Код репозитория основан на ironpython-stubs. Выражаю gtalarico бесконечную благодарность за вклад в развитие сообщества разработчиков скриптов и плаги

12 Nov 23, 2022
Simple but maybe too simple config management through python data classes. We use it for machine learning.

👩‍✈️ Coqpit Simple, light-weight and no dependency config handling through python data classes with to/from JSON serialization/deserialization. Curre

coqui 67 Nov 29, 2022
This repo contains scripts that add functionality to xbar.

xbar-custom-plugins This repo contains scripts that add functionality to xbar. Usage You have to add scripts to xbar plugin folder. If you don't find

osman uygar 1 Jan 10, 2022
Simple Python-based web application to allow UGM students to fill their QR presence list without having another device in hand.

Praesentia Praesentia is a simple Python-based web application to allow UGM students to fill their QR presence list without having another device in h

loncat 20 Sep 29, 2022
Generate Azure Blob Storage account authentication headers for Munki

Azure Blob Storage Authentication for Munki The Azure Blob Storage Middleware allows munki clients to connect securely, and directly to a munki repo h

Oliver Kieselbach 10 Apr 12, 2022
Run Python code right in your Telegram messages

Run Python code right in your Telegram messages Made with Telethon library, TGPy is a tool for evaluating expressions and Telegram API scripts. Instal

29 Nov 22, 2022
A casual IDOR exploiter that provides .csv files of url and status code.

IDOR-for-the-casual Do you like to IDOR? Are you a Windows hax0r? Well have I got a tool for you... A casual IDOR exploiter that provides .csv files o

Ben Wildee 2 Jan 20, 2022