Code for EMNLP2020 long paper: BERT-Attack: Adversarial Attack Against BERT Using BERT

Overview

BERT-ATTACK

Code for our EMNLP2020 long paper:

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Dependencies

Usage

To train a classification model, please use the run_glue.py script in the huggingface transformers==2.9.0.

To generate adversarial samples based on the masked-LM, run

python bertattack.py --data_path data_defense/imdb_1k.tsv --mlm_path bert-base-uncased --tgt_path models/imdbclassifier --use_sim_mat 1 --output_dir data_defense/imdb_logs.tsv --num_label 2 --use_bpe 1 --k 48 --start 0 --end 1000 --threshold_pred_score 0
  • --data_path: We take IMDB dataset as an example. Datasets can be obtained in TextFooler.
  • --mlm_path: We use BERT-base-uncased model as our target masked-LM.
  • --tgt_path: We follow the official fine-tuning process in transformers to fine-tune BERT as the target model.
  • --k 48: The threshold k is the number of possible candidates
  • --output_dir : The output file.
  • --start: --end: in case the dataset is large, we provide a script for multi-thread process.
  • --threshold_pred_score: a score in cutting off predictions that may not be suitable (details in Section5.1)

Note

The datasets are re-formatted to the GLUE style.

Some configs are fixed, you can manually change them.

If you need to use similar-words-filter, you need to download and process consine similarity matrix following TextFooler. We only use the filter in sentiment classification tasks like IMDB and YELP.

If you need to evaluate the USE-results, you need to create the corresponding tensorflow environment USE.

For faster generation, you could turn off the BPE substitution.

As illustrated in the paper, we set thresholds to balance between the attack success rate and USE similarity score.

The multi-thread process use the batchrun.py script

You can run

cat cmd.txt | python batchrun.py --gpus 0,1,2,3 

to simutaneously generate adversarial samples of the given dataset for faster generation. We use the IMDB dataset as an example.

Owner
Linyang Li
FudanNLP
Linyang Li
TensorFlow GNN is a library to build Graph Neural Networks on the TensorFlow platform.

TensorFlow GNN This is an early (alpha) release to get community feedback. It's under active development and we may break API compatibility in the fut

889 Dec 30, 2022
STARCH compuets regional extreme storm physical characteristics and moisture balance based on spatiotemporal precipitation data from reanalysis or climate model data.

STARCH (Storm Tracking And Regional CHaracterization) STARCH computes regional extreme storm physical and moisture balance characteristics based on sp

Onosama 7 Oct 20, 2022
LSTMs (Long Short Term Memory) RNN for prediction of price trends

Price Prediction with Recurrent Neural Networks LSTMs BTC-USD price prediction with deep learning algorithm. Artificial Neural Networks specifically L

5 Nov 12, 2021
Implementation of character based convolutional neural network

Character Based CNN This repo contains a PyTorch implementation of a character-level convolutional neural network for text classification. The model a

Ahmed BESBES 248 Nov 21, 2022
Learning Dense Representations of Phrases at Scale (Lee et al., 2020)

DensePhrases DensePhrases provides answers to your natural language questions from the entire Wikipedia in real-time. While it efficiently searches th

Princeton Natural Language Processing 540 Dec 30, 2022
The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

SF-Net for fullband SE This is the repo of the manuscript "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Ban

Guochen Yu 36 Dec 02, 2022
Python Auto-ML Package for Tabular Datasets

Tabular-AutoML AutoML Package for tabular datasets Tabular dataset tuning is now hassle free! Run one liner command and get best tuning and processed

Sagnik Roy 18 Nov 20, 2022
Diffusion Normalizing Flow (DiffFlow) Neurips2021

Diffusion Normalizing Flow (DiffFlow) Reproduce setup environment The repo heavily depends on jam, a personal toolbox developed by Qsh.zh. The API may

76 Jan 01, 2023
This is an open source python repository for various python tests

Welcome to Py-tests This is an open source python repository for various python tests. This is in response to the hacktoberfest2021 challenge. It is a

Yada Martins Tisan 3 Oct 31, 2021
Author: Wenhao Yu ([email protected]). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation

Diversifying Commonsense Reasoning Generation on Knowledge Graph Introduction -- This is the pytorch implementation of our ACL 2022 paper "Diversifyin

DM2 Lab @ ND 61 Dec 30, 2022
Cross-Modal Contrastive Learning for Text-to-Image Generation

Cross-Modal Contrastive Learning for Text-to-Image Generation This repository hosts the open source JAX implementation of XMC-GAN. Setup instructions

Google Research 94 Nov 12, 2022
Imaginaire - NVIDIA's Deep Imagination Team's PyTorch Library

Imaginaire Docs | License | Installation | Model Zoo Imaginaire is a pytorch library that contains optimized implementation of several image and video

NVIDIA Research Projects 3.6k Dec 29, 2022
PyTorch implementation of Super SloMo by Jiang et al.

Super-SloMo PyTorch implementation of "Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation" by Jiang H., Sun

Avinash Paliwal 2.9k Jan 03, 2023
Implementation of H-UCRL Algorithm

Implementation of H-UCRL Algorithm This repository is an implementation of the H-UCRL algorithm introduced in Curi, S., Berkenkamp, F., & Krause, A. (

Sebastian Curi 25 May 20, 2022
Implementation for paper LadderNet: Multi-path networks based on U-Net for medical image segmentation

Implementation for paper LadderNet: Multi-path networks based on U-Net for medical image segmentation This implementation is based on orobix implement

Juntang Zhuang 116 Sep 06, 2022
Learning Modified Indicator Functions for Surface Reconstruction

Learning Modified Indicator Functions for Surface Reconstruction In this work, we propose a learning-based approach for implicit surface reconstructio

4 Apr 18, 2022
Proof-Of-Concept Piano-Drums Music AI Model/Implementation

Rock Piano "When all is one and one is all, that's what it is to be a rock and not to roll." ---Led Zeppelin, "Stairway To Heaven" Proof-Of-Concept Pi

Alex 4 Nov 28, 2021
Structured Edge Detection Toolbox

################################################################### # # # Structure

Piotr Dollar 779 Jan 02, 2023
A visualization tool to show a TensorFlow's graph like TensorBoard

tfgraphviz tfgraphviz is a module to visualize a TensorFlow's data flow graph like TensorBoard using Graphviz. tfgraphviz enables to provide a visuali

44 Nov 09, 2022
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Segmentation Transformer Implementation of Segmentation Transformer in PyTorch, a new model to achieve SOTA in semantic segmentation while using trans

Abhay Gupta 161 Dec 08, 2022