FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data

Last update: Sep 06, 2022

Overview

Flexible EM-Inspired Discriminant Analysis (FEMDA) is a robust supervised classification algorithm that performs well on noisy and contaminated datasets.

Authors

Andrew Wang, University of Cambridge, Cambridge, UK
Pierre Houdouin, CentraleSupélec, Paris, France

Installation

pip install -i https://test.pypi.org/simple/ femda

Get started

>>> from sklearn.datasets import load_iris
>>> from femda import FEMDA
>>> X, y = load_iris(return_X_y=True)
>>> clf = FEMDA()
>>> clf.fit(X, y)
FEMDA()
>>> clf.score(X, y)
0.9666666666666667
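
The score above is computed on the same data the classifier was fitted on, so it is an optimistic estimate. A minimal sketch of a held-out evaluation follows. The helper `holdout_accuracy` is hypothetical (not part of femda), and scikit-learn's `LinearDiscriminantAnalysis` is used only as a stand-in so the snippet runs without femda installed. It assumes `FEMDA()` can be passed in its place, since it exposes the same `fit`/`score` interface in the session above.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split


def holdout_accuracy(clf, X, y, test_size=0.3, seed=0):
    """Fit clf on a stratified training split and return accuracy on the rest.

    Hypothetical helper for illustration; not part of the femda package.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    return clf.fit(X_train, y_train).score(X_test, y_test)


X, y = load_iris(return_X_y=True)

# Stand-in classifier (assumption): replace with FEMDA() after
# `from femda import FEMDA` to evaluate FEMDA itself.
acc = holdout_accuracy(LinearDiscriminantAnalysis(), X, y)
print(f"held-out accuracy: {acc:.3f}")
```

Stratifying the split keeps the three iris classes balanced in both parts, which matters for a dataset this small.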

Using a specific dataset...

>>> import femda.experiments.preprocessing as pre
>>> X_train, y_train, X_test, y_test = pre.statlog(r"root\datasets\\")
>>> FEMDA().fit(X_train, y_train).score(X_test, y_test)
...

Using a sklearn.pipeline.Pipeline...

>>> from sklearn.datasets import load_digits
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.decomposition import PCA
>>> X, y = load_digits(return_X_y=True)
>>> pipe = make_pipeline(PCA(n_components=5), FEMDA()).fit(X, y)
>>> pipe.predict(X)
...
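
Because the pipeline bundles PCA with the classifier, it can also be cross-validated as a single estimator. The sketch below uses scikit-learn's `cross_val_score` with `LinearDiscriminantAnalysis` as a stand-in final step so that it runs without femda. It assumes `FEMDA()` can be cloned by scikit-learn like any other estimator and swapped in unchanged.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)

# Stand-in classifier (assumption): swap in FEMDA() as the final step
# after `from femda import FEMDA`.
pipe = make_pipeline(PCA(n_components=5), LinearDiscriminantAnalysis())

# cross_val_score clones and refits the whole pipeline on each training fold,
# so PCA is never fitted on the fold that is being scored.
scores = cross_val_score(pipe, X, y, cv=5)
print(f"mean accuracy over {scores.size} folds: {scores.mean():.3f}")
```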

Run all experiments presented in the paper

>>> from femda.experiments import run_experiments
>>> run_experiments()
...

See for more.

Abstract

Linear and Quadratic Discriminant Analysis are well-known classical methods, but they degrade sharply under non-Gaussian class distributions and are not robust to contaminated datasets. In this paper, we present a new discriminant-analysis-style classification algorithm that directly models noise and diverse distribution shapes, and can therefore handle a wide range of datasets.

Each data point is modelled by its own arbitrary Elliptically Symmetrical (ES) distribution and its own arbitrary scale parameter, which directly models highly heterogeneous, non-i.i.d. datasets. We show that maximum-likelihood parameter estimation and classification are simple and fast under this model.
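
One way to write the per-sample model described above is sketched below. The notation is an assumption made for illustration and may differ from the paper's: sample x_i in dimension m belongs to class k, the location mu_k and scatter Sigma_k are shared within the class, and the scale tau_i, density generator g_i and normalising constant C_i are specific to the sample.

```latex
% Assumed notation (not taken from the README):
%   x_i \in \mathbb{R}^m  sample i, belonging to class k
%   \mu_k, \Sigma_k       class location and scatter matrix
%   \tau_i > 0            per-sample scale parameter
%   g_i, C_i              per-sample density generator and normalising constant
f_i(x_i \mid k)
  = C_i \, \lvert \tau_i \Sigma_k \rvert^{-1/2} \,
    g_i\!\left( (x_i - \mu_k)^\top (\tau_i \Sigma_k)^{-1} (x_i - \mu_k) \right)
  = C_i \, \tau_i^{-m/2} \, \lvert \Sigma_k \rvert^{-1/2} \,
    g_i\!\left( \frac{(x_i - \mu_k)^\top \Sigma_k^{-1} (x_i - \mu_k)}{\tau_i} \right)
```

The Gaussian case is recovered with g_i(t) = exp(-t/2), tau_i = 1 and C_i = (2 pi)^{-m/2}. Since tau_i Sigma_k = (tau_i / c)(c Sigma_k) for any c > 0, the overall scale of Sigma_k is not identified on its own in this form, so some normalisation of Sigma_k (a fixed trace, for instance) would be needed.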

On synthetic datasets, we show that the model adapts to a wide range of Elliptically Symmetrical distribution shapes and to varying levels of contamination. We then show that our algorithm outperforms other robust methods on contaminated datasets from Computer Vision and NLP.
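
As a rough illustration of why contamination hurts classical discriminant analysis, the sketch below fits scikit-learn's `LinearDiscriminantAnalysis` on clean and on contaminated synthetic training data and scores both fits on a clean test set. This is not one of the paper's experiments: the two-Gaussian setup, the outlier location, the 20% contamination rate and the helper names are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def make_two_gaussians(n_per_class, rng):
    """Two unit-covariance Gaussian classes in 2D, centred at (0, 0) and (3, 0)."""
    X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n_per_class, 2))
    X1 = rng.normal(loc=[3.0, 0.0], scale=1.0, size=(n_per_class, 2))
    return np.vstack([X0, X1]), np.repeat([0, 1], n_per_class)


def contaminate(X, y, rate, rng):
    """Return a copy of X with a fraction `rate` of class-0 rows replaced by outliers."""
    X = X.copy()
    idx = np.flatnonzero(y == 0)
    n_out = int(rate * idx.size)
    if n_out == 0:
        return X
    chosen = rng.choice(idx, size=n_out, replace=False)
    # Outliers keep the class-0 label but sit far from the clean class-0 cloud.
    X[chosen] = rng.normal(loc=[30.0, 0.0], scale=1.0, size=(n_out, 2))
    return X


rng = np.random.default_rng(0)
X_train, y_train = make_two_gaussians(500, rng)
X_test, y_test = make_two_gaussians(500, rng)

accuracies = {}
for rate in (0.0, 0.2):
    X_noisy = contaminate(X_train, y_train, rate, rng)
    clf = LinearDiscriminantAnalysis().fit(X_noisy, y_train)
    accuracies[rate] = clf.score(X_test, y_test)  # the test set stays clean
    print(f"contamination {rate:.0%}: test accuracy {accuracies[rate]:.3f}")
```

In this setup the outliers drag the estimated class-0 mean away from the clean data, so the contaminated fit should score noticeably lower on the clean test set. Any classifier with a scikit-learn-style `fit`/`score` interface, `FEMDA()` included, could be dropped into the same loop for comparison.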
