Relevance Vector Machine implementation using the scikit-learn API.

Last update: Nov 18, 2022

scikit-rvm

scikit-rvm is a Python module implementing the Relevance Vector Machine (RVM) machine learning technique using the scikit-learn API.

Quickstart

With NumPy, SciPy and scikit-learn available in your environment, install with:

pip install https://github.com/JamesRitchie/scikit-rvm/archive/master.zip

Regression is done with the RVR class:

>>> from skrvm import RVR
>>> X = [[0, 0], [2, 2]]
>>> y = [0.5, 2.5]
>>> clf = RVR(kernel='linear')
>>> clf.fit(X, y)
RVR(alpha=1e-06, beta=1e-06, beta_fixed=False, bias_used=True, coef0=0.0,
coef1=None, degree=3, kernel='linear', n_iter=3000,
threshold_alpha=1000000000.0, tol=0.001, verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.49995187])

Classification is done with the RVC class:

>>> from skrvm import RVC
>>> from sklearn.datasets import load_iris
>>> iris = load_iris()
>>> clf = RVC()
>>> clf.fit(iris.data, iris.target)
RVC(alpha=1e-06, beta=1e-06, beta_fixed=False, bias_used=True, coef0=0.0,
coef1=None, degree=3, kernel='rbf', n_iter=3000, n_iter_posterior=50,
threshold_alpha=1000000000.0, tol=0.001, verbose=False)
>>> clf.score(iris.data, iris.target)
0.97999999999999998

Theory

The RVM is a sparse Bayesian analogue to the Support Vector Machine, with a number of advantages:

It provides probabilistic estimates, as opposed to the SVM's point estimates.
It typically provides a sparser solution than the SVM, whose number of support vectors tends to grow linearly with the size of the training set.
It does not need a complexity parameter to be selected in order to avoid overfitting.

However, it is more expensive to train than the SVM, although prediction is faster and no cross-validation runs are needed to tune a complexity parameter.

The RVM's original creator Mike Tipping provides a selection of papers offering detailed insight into the formulation of the RVM (and sparse Bayesian learning in general) on a dedicated page, along with a Matlab implementation.

Most of this implementation was written working from Section 7.2 of Christopher M. Bishop's Pattern Recognition and Machine Learning.
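
To make the idea of sparse Bayesian regression more concrete, here is a minimal NumPy sketch of the evidence re-estimation loop for regression as described in Section 7.2 of Bishop's book. It is not the skrvm source code: the function names (`rbf_design_matrix`, `weight_posterior`, `fit_rvr_sketch`, `predict_rvr_sketch`), the toy data, the RBF `scale` and the `eps` numerical guard are all assumptions made for illustration. The starting values for `alpha` and `beta`, the pruning threshold and the tolerance are chosen to mirror the defaults shown in the Quickstart output.

```python
import numpy as np


def rbf_design_matrix(X, centers, scale=0.5):
    """Bias column followed by one Gaussian basis function per center."""
    sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((X.shape[0], 1)), np.exp(-scale * sq_dists)])


def weight_posterior(P, t, alpha, beta):
    """Gaussian posterior over the weights: covariance Sigma and mean m."""
    Sigma = np.linalg.inv(np.diag(alpha) + beta * P.T @ P)
    m = beta * Sigma @ P.T @ t
    return Sigma, m


def fit_rvr_sketch(Phi, t, n_iter=3000, alpha_max=1e9, tol=1e-3, eps=1e-12):
    """Re-estimate one prior precision per weight and prune irrelevant ones."""
    n_samples, n_basis = Phi.shape
    keep = np.arange(n_basis)        # columns of Phi still in the model
    alpha = np.full(n_basis, 1e-6)   # prior precision of each weight
    beta = 1e-6                      # noise precision

    for _ in range(n_iter):
        P = Phi[:, keep]
        Sigma, m = weight_posterior(P, t, alpha, beta)

        # gamma_i in [0, 1] measures how well the data determine weight i.
        # The clip and the floor on m**2 are numerical guards (assumptions).
        gamma = np.clip(1.0 - alpha * np.diag(Sigma), eps, 1.0)
        alpha_new = gamma / np.maximum(m ** 2, eps)
        beta = (n_samples - gamma.sum()) / np.sum((t - P @ m) ** 2)

        # Weights whose precision exceeds the threshold are effectively
        # zero, so their basis functions are dropped from the model.
        active = alpha_new < alpha_max
        if not active.any():
            active[np.argmin(alpha_new)] = True  # always keep one function
        delta = np.max(np.abs(alpha_new[active] - alpha[active]))
        keep, alpha = keep[active], alpha_new[active]
        if delta < tol:
            break

    Sigma, m = weight_posterior(Phi[:, keep], t, alpha, beta)
    return keep, m, Sigma, beta


def predict_rvr_sketch(Phi_new, keep, m, Sigma, beta):
    """Predictive mean and variance (noise plus weight uncertainty)."""
    P = Phi_new[:, keep]
    mean = P @ m
    var = 1.0 / beta + np.sum((P @ Sigma) * P, axis=1)
    return mean, var


# Toy data: a noisy sine wave and ten evenly spaced RBF centers.
rng = np.random.default_rng(0)
X = np.linspace(-5.0, 5.0, 50)[:, None]
t = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
centers = np.linspace(-5.0, 5.0, 10)[:, None]

Phi = rbf_design_matrix(X, centers)
keep, m, Sigma, beta = fit_rvr_sketch(Phi, t)
mean, var = predict_rvr_sketch(Phi, keep, m, Sigma, beta)

print(f"{keep.size} of {Phi.shape[1]} basis functions kept")
print("training RMSE:", np.sqrt(np.mean((mean - t) ** 2)))
```

The predictive variance returned by `predict_rvr_sketch` is what the first advantage above refers to: each prediction comes with an uncertainty estimate rather than only a point value. The sketch uses fewer basis functions than training points to keep the matrix inversion well conditioned; a kernel machine with one basis function per training point is more prone to the ill-conditioning mentioned under Future Improvements.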

Contributors

Future Improvements

Implement the fast Sequential Sparse Bayesian Learning Algorithm outlined in Section 7.2.3 of Pattern Recognition and Machine Learning.
Handle ill-conditioning errors more gracefully.
Implement more kernel choices.
Create more detailed examples with IPython notebooks.

Owner

James Ritchie
