TensorFlow implementation of an arbitrary order Factorization Machine

Last update: Dec 21, 2022

Overview

This is a TensorFlow implementation of an arbitrary order (>=2) Factorization Machine, based on the paper "Factorization Machines with libFM".

It supports:

dense and sparse inputs
different (gradient-based) optimization methods
classification/regression via different loss functions (logistic and MSE are implemented)
logging via TensorBoard
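
As a rough illustration of the two losses named in the list above, here is a minimal pure-Python sketch of their textbook definitions. It is not taken from the tffm source: the function names are hypothetical, and the logistic loss assumes labels encoded as -1/+1, which may differ from the encoding the library expects.

```python
import math


def logistic_loss(y_true, scores):
    """Mean logistic loss. Assumption: labels are encoded as -1 or +1."""
    losses = [math.log1p(math.exp(-y * s)) for y, s in zip(y_true, scores)]
    return sum(losses) / len(losses)


def mse_loss(y_true, preds):
    """Mean squared error between targets and predictions."""
    errors = [(y - p) ** 2 for y, p in zip(y_true, preds)]
    return sum(errors) / len(errors)
```

The logistic loss is the usual choice for classification and MSE for regression.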

The inference time is linear with respect to the number of features.
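
To show why linear-time inference is possible, here is a minimal pure-Python sketch for the second-order case only. It uses the standard square-of-sum identity for the pairwise interaction term; the function names are hypothetical, and this is not necessarily how tffm computes its higher-order terms.

```python
from itertools import combinations


def pairwise_naive(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j]; quadratic in len(x)."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        total += dot * x[i] * x[j]
    return total


def pairwise_linear(x, V):
    """Same quantity in O(len(x) * rank) via the square-of-sum identity."""
    rank = len(V[0])
    total = 0.0
    for f in range(rank):
        # z_i = V[i][f] * x[i] is the contribution of feature i to factor f
        z = [V[i][f] * x[i] for i in range(len(x))]
        s = sum(z)                        # sum_i z_i
        s_sq = sum(zi * zi for zi in z)   # sum_i z_i^2
        total += 0.5 * (s * s - s_sq)
    return total


# Toy input: 4 features, rank-2 factors (values chosen arbitrarily)
x = [1.0, 0.0, 2.0, 3.0]
V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.0]]
assert abs(pairwise_naive(x, V) - pairwise_linear(x, V)) < 1e-9
```

Both functions return the same value, but the second one visits each feature once per factor instead of enumerating all pairs.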

Tested on Python 3.5, but should also work on Python 2.7.

This implementation is quite similar to the one described in the paper by Blondel et al. [https://arxiv.org/abs/1607.07195], but it was developed independently and before that paper first appeared.

Dependencies

Installation

The stable version can be installed via pip install tffm.

Usage

The interface is similar to scikit-learn models. To train a 6th-order FM model with rank=10 for 100 iterations with learning_rate=0.01, use the following sample:

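import tensorflow as tf
from tffm import TFFMClassifier

# tf.train.AdamOptimizer is part of the TensorFlow 1.x API
model = TFFMClassifier(
    order=6,                # highest interaction order
    rank=10,                # rank of the factorization
    optimizer=tf.train.AdamOptimizer(learning_rate=0.01),
    n_epochs=100,
    batch_size=-1,
    init_std=0.001,
    input_type='dense'
)
# X_tr, y_tr: training features and labels
model.fit(X_tr, y_tr, show_progress=True)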

See example.ipynb and gpu_benchmark.ipynb for more details.

It's highly recommended to read tffm/core.py for help.

Testing

Just run python test.py in the terminal. nosetests works too, but you must pass the --logging-level=WARNING flag to avoid printing insane amounts of TensorFlow logs to the screen.

Citation

If you use this software in academic research, please cite it using the following BibTeX:

@misc{trofimov2016,
author = {Mikhail Trofimov and Alexander Novikov},
title = {tffm: TensorFlow implementation of an arbitrary order Factorization Machine},
year = {2016},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/geffy/tffm}},
}
