Universal Adversarial Triggers for Attacking and Analyzing NLP (EMNLP 2019)

Last update: Dec 17, 2022

This is the official code for the EMNLP 2019 paper, Universal Adversarial Triggers for Attacking and Analyzing NLP. This repository contains the code for replicating our experiments and creating universal triggers.

Read our blog and our paper for more information on the method.

Dependencies

This code is written in PyTorch. The GPT-2 code is based on HuggingFace's Transformer repo, and the experiments on SQuAD, SNLI, and SST use AllenNLP. The code is flexible and should apply to most models (especially if the model is in AllenNLP), so you can extend it to the model or task you want.

The code is written to run on a GPU, which is likely necessary for the larger models because of their compute cost. I used one GTX 1080 for all the experiments; most experiments run in a few minutes. The SST and SNLI experiments can also be run without a GPU.

Installation

An easy way to install the code is to create a fresh Anaconda environment:

conda create -n triggers python=3.6
source activate triggers
pip install -r requirements.txt

Now you should be ready to go!

Getting Started

The repository is broken down by task:

sst attacks sentiment analysis using the SST dataset (AllenNLP-based).
snli attacks natural language inference models on the SNLI dataset (AllenNLP-based).
squad attacks reading comprehension models using the SQuAD dataset (AllenNLP-based).
gpt2 attacks the GPT-2 language model using HuggingFace's model.

To get started, we recommend snli or sst. In snli, we download pre-trained models (no training required) and create triggers that are prepended to the hypothesis sentence. In sst, we walk through training a simple LSTM sentiment analysis model in AllenNLP and then create universal adversarial triggers for that model. The code is well documented and walks you through the attack methodology.
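To make the idea of a universal trigger concrete, here is a minimal sketch that prepends one fixed token sequence to every input and compares accuracy before and after. It is not code from this repository: `prepend_trigger`, `accuracy`, and `toy_classifier` are hypothetical names, and the keyword-based classifier is only a stand-in for a real pre-trained model.

```python
from typing import Callable, List, Sequence

Tokens = List[str]


def prepend_trigger(trigger: Sequence[str], inputs: Sequence[Tokens]) -> List[Tokens]:
    """Return copies of the inputs with the same trigger tokens placed in front."""
    return [list(trigger) + list(tokens) for tokens in inputs]


def accuracy(
    classify: Callable[[Tokens], str],
    inputs: Sequence[Tokens],
    labels: Sequence[str],
) -> float:
    """Fraction of inputs for which the classifier returns the gold label."""
    correct = sum(classify(tokens) == label for tokens, label in zip(inputs, labels))
    return correct / len(labels)


def toy_classifier(tokens: Tokens) -> str:
    """Hypothetical stand-in model: predicts 'contradiction' if a negation word appears."""
    negations = {"nobody", "never", "no"}
    return "contradiction" if negations & set(tokens) else "entailment"


# Toy hypothesis sentences, all labeled 'entailment' (made-up data for illustration).
hypotheses = [
    ["a", "man", "is", "outside"],
    ["people", "are", "eating"],
    ["a", "dog", "runs"],
]
labels = ["entailment", "entailment", "entailment"]

clean_acc = accuracy(toy_classifier, hypotheses, labels)
attacked_acc = accuracy(toy_classifier, prepend_trigger(["nobody"], hypotheses), labels)
print(clean_acc, attacked_acc)
```

The trigger is input-agnostic: the same tokens are added to every example, and the attack is judged by how far accuracy falls across the whole set.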

The gradient-based attacks are written in attacks.py. The file utils.py contains the code for evaluating models, computing gradients, and evaluating the top candidates for the attack. utils.py is only used by the AllenNLP models (i.e., not for GPT-2).
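As a rough illustration of how a gradient-based candidate search can work, the sketch below ranks vocabulary tokens for each trigger position by the dot product between the averaged loss gradient and each token embedding (a HotFlip-style first-order estimate). This is a simplified pure-Python sketch under stated assumptions, not the contents of attacks.py; the function name `top_k_candidates`, its parameters, and the toy numbers are all hypothetical.

```python
from typing import List, Sequence


def top_k_candidates(
    averaged_grad: Sequence[Sequence[float]],
    embedding_matrix: Sequence[Sequence[float]],
    k: int = 3,
    increase_loss: bool = True,
) -> List[List[int]]:
    """Rank vocabulary tokens for each trigger position with a first-order estimate.

    averaged_grad[i] is the loss gradient with respect to the embedding at
    trigger position i, averaged over a batch. embedding_matrix[v] is the
    embedding of vocabulary token v. The score of swapping token v into
    position i is the dot product grad_i . e_v; the current token's term is
    constant per position, so it does not change the ranking.
    """
    sign = 1.0 if increase_loss else -1.0
    candidates = []
    for grad in averaged_grad:
        scores = [
            sign * sum(g * e for g, e in zip(grad, emb)) for emb in embedding_matrix
        ]
        # Highest score first; keep the k best token ids for this position.
        ranked = sorted(range(len(scores)), key=lambda v: scores[v], reverse=True)
        candidates.append(ranked[:k])
    return candidates


# Toy example: 4-token vocabulary, 2-dimensional embeddings, 2 trigger positions.
embedding_matrix = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
averaged_grad = [[2.0, 0.5], [-1.0, 2.0]]
print(top_k_candidates(averaged_grad, embedding_matrix, k=2))  # [[0, 3], [1, 2]]
```

The first-order estimate is only approximate, so the returned token ids are candidates: each one still needs to be scored with a real forward pass before a replacement is kept.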

References

Please consider citing our work if you found this code or our paper beneficial to your research.

@inproceedings{Wallace2019Triggers,
  Author = {Eric Wallace and Shi Feng and Nikhil Kandpal and Matt Gardner and Sameer Singh},
  Booktitle = {Empirical Methods in Natural Language Processing},
  Year = {2019},
  Title = {Universal Adversarial Triggers for Attacking and Analyzing {NLP}}
}

Contributions and Contact

This code was developed by Eric Wallace, contact available at [email protected].

If you'd like to contribute code, feel free to open a pull request. If you find an issue with the code, please open an issue.
