A Python 3 library for time series data mining tasks, utilizing matrix profile algorithms

Last update: Dec 29, 2022

Overview

MatrixProfile

MatrixProfile is a Python 3 library, brought to you by the Matrix Profile Foundation, for mining time series data. The Matrix Profile is a novel data structure with corresponding algorithms (stomp, regimes, motifs, etc.) developed by the Keogh and Mueen research groups at UC Riverside and the University of New Mexico. The goal of this library is to make these algorithms accessible to both novices and experts through standardization of core concepts, a simple API, and sensible default parameter values.
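To make the data structure concrete, here is a minimal brute-force sketch in pure Python. It is not this library's implementation, which relies on much faster algorithms such as stomp. The names `znormalize` and `naive_matrix_profile` and the half-window exclusion zone are assumptions chosen for illustration. For each subsequence, the sketch records the z-normalized Euclidean distance to its nearest non-overlapping neighbour: low values point at repeated patterns (motifs), high values at anomalies (discords).

```python
import math
from statistics import mean, pstdev


def znormalize(values):
    """Rescale a subsequence to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant subsequence has no shape; map it to all zeros.
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def naive_matrix_profile(series, window):
    """Brute-force self-join, O(n^2 * window) time; for illustration only."""
    n_subs = len(series) - window + 1
    if window < 2 or n_subs < 2:
        raise ValueError("window must be >= 2 and shorter than the series")
    subs = [znormalize(series[i:i + window]) for i in range(n_subs)]
    # Skip neighbours that overlap the subsequence itself, otherwise every
    # subsequence trivially matches a slightly shifted copy of itself.
    exclusion = max(1, window // 2)  # assumed exclusion-zone size
    profile = [math.inf] * n_subs    # distance to the nearest neighbour
    index = [-1] * n_subs            # position of that nearest neighbour
    for i in range(n_subs):
        for j in range(n_subs):
            if abs(i - j) <= exclusion:
                continue
            d = math.dist(subs[i], subs[j])
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index


# A repeating ramp with a single anomalous spike at position 13.
series = [0, 1, 2, 3] * 6
series[13] = 10
profile, index = naive_matrix_profile(series, window=4)

# The lowest profile value marks a motif, the highest a discord.
motif_start = min(range(len(profile)), key=profile.__getitem__)
discord_start = max(range(len(profile)), key=profile.__getitem__)
print("motif starts at", motif_start, "and repeats at", index[motif_start])
print("discord starts at", discord_start)
```

In this toy series the discord lands on one of the subsequences covering the spike, while the untouched ramps match each other exactly and have a profile value of zero.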

In addition to this Python library, the Matrix Profile Foundation provides implementations in other languages. These implementations share a consistent API, allowing you to switch between them without a steep learning curve.

tsmp - an R implementation
go-matrixprofile - a Golang implementation

Python Support

Currently, we support the following versions of Python:

Python 2 is no longer supported. There are earlier versions of this library that support Python 2.

Installation

The easiest way to install this library is using pip or conda. If you would like to install it from source, please review the installation documentation for your platform.

Installation with pip

pip install matrixprofile

Installation with conda

conda config --add channels conda-forge
conda install matrixprofile
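After installing, you may want to confirm that the package is visible to the interpreter you are actually using. The sketch below relies only on the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_version` is hypothetical, and the sketch assumes the distribution is registered under the same name used in the pip command above.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Assumes the distribution name matches the one passed to pip.
version = installed_version("matrixprofile")
if version is None:
    print("matrixprofile is not installed in this environment")
else:
    print(f"matrixprofile {version} is installed")
```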

Getting Started

This article provides introductory material on the Matrix Profile: Introduction to Matrix Profiles

This article provides details about core concepts introduced in this library: How To Painlessly Analyze Your Time Series

Our documentation provides a quick start guide, examples, and API documentation. It is the source of truth for getting up and running.

Algorithms

For details about the algorithms implemented, including performance characteristics, please refer to the documentation.

Getting Help

We provide a dedicated Discord channel where practitioners can discuss applications and ask questions about the Matrix Profile Foundation libraries. If you would rather not join Discord, please open a GitHub issue.

Contributing

Please review the contributing guidelines located in our documentation.

Code of Conduct

Please review our Code of Conduct documentation.

Citations

All proper acknowledgements for works of others may be found in our citation documentation.

Citing

Please cite this work using the Journal of Open Source Software article.

Van Benschoten et al., (2020). MPA: a novel cross-language API for time series analysis. Journal of Open Source Software, 5(49), 2179, https://doi.org/10.21105/joss.02179

@article{VanBenschoten2020,
    doi = {10.21105/joss.02179},
    url = {https://doi.org/10.21105/joss.02179},
    year = {2020},
    publisher = {The Open Journal},
    volume = {5},
    number = {49},
    pages = {2179},
    author = {Andrew Van Benschoten and Austin Ouyang and Francisco Bischoff and Tyler Marrs},
    title = {MPA: a novel cross-language API for time series analysis},
    journal = {Journal of Open Source Software}
}
