Overview

FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

FAMIE is a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction (IE). FAMIE is designed to address a fundamental problem in existing AL frameworks: annotators must wait a long time between annotation batches because model training and data selection at each AL iteration are time-consuming. With a novel proxy AL mechanism and the integration of our SOTA multilingual toolkit Trankit, FAMIE can quickly provide users with a labeled dataset and a ready-to-use model for different IE tasks in over 100 languages.

FAMIE's documentation page: https://famie.readthedocs.io

FAMIE's demo website: http://nlp.uoregon.edu:9000/

Installation

FAMIE can be easily installed via one of the following methods:

Using pip

pip install famie

This command installs FAMIE and all required dependencies automatically.

From source

git clone https://github.com/nlp-uoregon/famie.git
cd famie
pip install -e .

These commands clone our GitHub repository and install FAMIE in editable mode.

Usage

FAMIE currently supports Named Entity Recognition and Event Detection for over 100 languages. Using FAMIE involves the following three steps:

  • Start an annotation session.
  • Annotate data for a target task.
  • Access the labeled data and a ready-to-use model returned by FAMIE.

Starting an annotation session

To start an annotation session, please use the following command:

famie start

This runs a server on the user's local machine (no data or models ever leave the local machine), and FAMIE's web interface is then available at http://127.0.0.1:9000/. As an AL framework, FAMIE provides several data selection algorithms that recommend the most beneficial examples to label at each annotation iteration. The desired strategy is chosen via the optional argument --selection [mnlp|badge|bertkm|random], as shown below.
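For example, to start an annotation session that uses the BADGE selection strategy (the other listed strategies are selected the same way):

famie start --selection badge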

Annotating data

Once the server is running, open the web interface, create a project for the target task, and label the examples that FAMIE suggests at each annotation iteration; FAMIE trains and updates its models between iterations.

Accessing the labeled data and the trained model

After an annotation session, the labeled data and the trained model can be accessed from Python via the famie package:

import famie

# access a project via its name
p = famie.get_project('named-entity-recognition') 

# access the project's labeled data
data = p.get_labeled_data() # a Python dictionary

# export the project's labeled data to a file
p.export_labeled_data('data.json')

# export the project's trained model to a file
p.export_trained_model('model.ckpt')

# access the project's trained model
model = p.get_trained_model()

# access a trained model from file
model = famie.load_model_from_file('model.ckpt')

# use the trained model to make predictions
model.predict('Oregon is a beautiful state!')
# ['B-Location', 'O', 'O', 'O', 'O']
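
As a minimal sketch of how these pieces fit together (the input sentences below are illustrative; only the calls shown above are used), a previously exported checkpoint can be reloaded and applied to several inputs:

import famie

# reload a previously exported checkpoint
model = famie.load_model_from_file('model.ckpt')

# illustrative inputs; replace with your own sentences
sentences = [
    'Oregon is a beautiful state!',
    'The University of Oregon is located in Eugene.',
]

for sent in sentences:
    labels = model.predict(sent)  # one tag per token, e.g. ['B-Location', 'O', ...]
    print(sent, labels)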