MEND: Model Editing Networks using Gradient Decomposition

Setup

Environment

This codebase uses Python 3.7.9. Other versions may work as well.

Create a virtualenv (pyenv can help with this) and install the dependencies:

$ python -m venv env
$ source env/bin/activate
(env) $ pip install -r requirements.txt

Data

You can download the data needed for this project from this Google Drive link. Unzip each sub-directory into mend/data and you should be good to go.
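The unzip step can be scripted. The following is a minimal sketch, assuming the downloaded archives are .zip files sitting in a local folder; the function name unzip_all and the folder you pass in are hypothetical and not part of this codebase.

```python
import zipfile
from pathlib import Path


def unzip_all(archive_dir, data_dir="mend/data"):
    """Extract every .zip found in archive_dir into data_dir.

    Returns the names of the archives that were extracted.
    archive_dir is an assumed download location, not a path the repo defines.
    """
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(archive.name)
    return extracted
```

Calling unzip_all with the folder that holds the downloads, from the directory that contains mend, should leave each sub-directory under mend/data.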

Running the code

Run MEND training/evaluation for distilGPT-2 on the wikitext editing problem with:

(env) $ python -m run +alg=mend +experiment=gen +model=distilgpt2

Other valid algs include efk (KnowledgeEditor) and enn (Editable Neural Networks). Valid experiments include fc (FEVER fact checking) and qa (zsRE question answering). Splits and rephrases for both come from De Cao et al. Check config/model for the available editable models. Note that not every model works with every experiment: GPT-style models only work with gen, seq2seq models only work with qa, and BERT only works with fc.
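The model/experiment pairing described above can be checked before launching a run. This is a small sketch, not code from the repository: the family labels ("gpt", "seq2seq", "bert") and the function check_pairing are illustrative names chosen here.

```python
# Pairings taken from the note above. The family labels are illustrative
# keys, not identifiers used by the codebase.
SUPPORTED_EXPERIMENT = {
    "gpt": "gen",      # GPT-style models only work with gen
    "seq2seq": "qa",   # seq2seq models only work with qa
    "bert": "fc",      # BERT only works with fc
}


def check_pairing(family, experiment):
    """Return True if the pairing is valid, otherwise raise ValueError."""
    expected = SUPPORTED_EXPERIMENT.get(family)
    if expected is None:
        raise ValueError(f"unknown model family: {family!r}")
    if experiment != expected:
        raise ValueError(
            f"{family} models only work with {expected!r}, not {experiment!r}"
        )
    return True
```

For example, check_pairing("gpt", "gen") passes, while check_pairing("bert", "qa") raises a ValueError.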

Also note that in the paper, we sample locality data from different datasets depending on the model. By default, training uses Natural Questions data (not zsRE data) to compute drawdown in the qa experiment, and OpenWebText data in the gen experiment. For models such as the distilgpt2 model we use (which was fine-tuned on wikitext) or the BART-base model, this behavior should be disabled with data.wiki_webtext=False or data.zsre_nq=False, respectively.
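As an illustration of how such an override fits into the run command, here is a sketch that assembles the argument list. It assumes overrides are passed as extra key=value arguments on the same command line; build_command is a hypothetical helper, not part of this codebase.

```python
def build_command(alg, experiment, model, overrides=None):
    """Return the argv list for a run, appending each override as key=value."""
    argv = [
        "python", "-m", "run",
        f"+alg={alg}",
        f"+experiment={experiment}",
        f"+model={model}",
    ]
    for key, value in (overrides or {}).items():
        argv.append(f"{key}={value}")
    return argv


# distilgpt2 was fine-tuned on wikitext, so the OpenWebText default is disabled.
cmd = build_command("mend", "gen", "distilgpt2", {"data.wiki_webtext": False})
print(" ".join(cmd))
# prints: python -m run +alg=mend +experiment=gen +model=distilgpt2 data.wiki_webtext=False
```

For the qa experiment with BART-base, the same pattern applies with data.zsre_nq=False and the matching model name from config/model.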

Citing the paper

If this code or paper was useful, please consider using the following citation:

@article{mitchell2021fast,
    title={Fast Model Editing at Scale},
    author={Mitchell, Eric and Lin, Charles and Bosselut, Antoine and Finn, Chelsea and Manning, Chris},
    year={2021}
}

Owner

Eric Mitchell