Towards Debiasing NLU Models from Unknown Biases

Overview

Towards Debiasing NLU Models from Unknown Biases

Abstract: NLU models often exploit biased features to achieve high dataset-specific performance without properly learning the intended task. Recently proposed debiasing methods are shown to be effective in mitigating this tendency. However, these methods rely on a major assumption that the type of biased features is known a-priori, which limits their application to many NLU tasks and datasets. In this work, we present the first step to bridge this gap by introducing a self-debiasing framework that prevents models from mainly utilizing biases without knowing them in advance. The proposed framework is general and complementary to the existing debiasing methods. We show that the proposed framework allows these existing methods to retain the improvement on the challenge datasets (i.e., sets of examples designed to expose models’ reliance to biases) without specifically targeting certain biases. Furthermore, the evaluation suggests that applying the framework results in improved overall robustness.

The repository contains the code to reproduce our work in debiasing NLU models without prior information on biases. We provide 3 runs of experiment that are shown in our paper:

  1. Debias MNLI model from syntactic bias and evaluate on HANS as the out-of-distribution data using example reweighting.
  2. Debias MNLI model from syntactic bias and evaluate on HANS as the out-of-distribution data using product of expert.
  3. Debias MNLI model from syntactic bias and evaluate on HANS as the out-of-distribution data using confidence regularization.

Requirements

The code requires python >= 3.6 and pytorch >= 1.1.0.

Additional required dependencies can be found in requirements.txt. Install all requirements by running:

pip install -r requirements.txt

Data

Our experiments use MNLI dataset version provided by GLUE benchmark. Download the file from here, and unzip under the directory ./dataset The dataset directory should be structured as the following:

└── dataset 
    └── MNLI
        ├── train.tsv
        ├── dev_matched.tsv
        ├── dev_mismatched.tsv
        ├── dev_mismatched.tsv

Running the experiments

For each evaluation setting, use the --mode arguments to set the appropriate loss function. Choose the annealed version of the loss function for reproducing the annealed results.

To reproduce our result on MNLI ⮕ HANS, run the following:

cd src/
CUDA_VISIBLE_DEVICES=9 python train_distill_bert.py \
  --output_dir ../experiments_self_debias_mnli_seed111/bert_reweighted_sampled2K_teacher_seed111_annealed_1to08 \
  --do_train --do_eval --mode reweight_by_teacher_annealed \
  --custom_teacher ../teacher_preds/mnli_trained_on_sample2K_seed111.json --seed 111 --which_bias hans

Biased examples identification

To obtain predictions of the shallow models, we train the same model architecture on the fraction of the dataset. For MNLI we subsample 2000 examples and train the model for 5 epochs. For obtaining shallow models of other datasets please see the appendix of our paper. The shallow model can be obtained with the command below:

cd src/
CUDA_VISIBLE_DEVICES=9 python train_distill_bert.py \
 --output_dir ../experiments_shallow_mnli/bert_base_sampled2K_seed111 \
 --do_train --do_eval --do_eval_on_train --mode none\
 --seed 111 --which_bias hans --debug --num_train_epochs 5 --debug_num 2000

Once the training and the evaluation on train set is done, copy the probability json files in the output directory to ../teacher_preds/mnli_trained_on_sample2K_seed111.json.

Expected results

Results on the MNLI ⮕ HANS setting without annealing:

Mode Seed MNLI-m MNLI-mm HANS avg.
None 111 84.57 84.72 62.04
reweighting 111 81.8 82.3 72.1
PoE 111 81.5 81.1 70.3
conf-reg 222 83.7 84.1 68.7
Owner
Ubiquitous Knowledge Processing Lab
Ubiquitous Knowledge Processing Lab
9th place solution

AllDataAreExt-Galixir-Kaggle-HPA-2021-Solution Team Members Qishen Ha is Master of Engineering from the University of Tokyo. Machine Learning Engineer

daishu 5 Nov 18, 2021
Use graph-based analysis to re-classify stocks and to improve Markowitz portfolio optimization

Dynamic Stock Industrial Classification Use graph-based analysis to re-classify stocks and experiment different re-classification methodologies to imp

Sheng Yang 10 Dec 05, 2022
Husein pet projects in here!

project-suka-suka Husein pet projects in here! List of projects mysejahtera-density. Generate resolution points using meshgrid and request each points

HUSEIN ZOLKEPLI 47 Dec 09, 2022
pytorch implementation of Attention is all you need

A Pytorch Implementation of the Transformer: Attention Is All You Need Our implementation is largely based on Tensorflow implementation Requirements N

230 Dec 07, 2022
This git repo contains the implementation of my ML project on Heart Disease Prediction

Introduction This git repo contains the implementation of my ML project on Heart Disease Prediction. This is a real-world machine learning model/proje

Aryan Dutta 1 Feb 02, 2022
OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages

OCR-Streamlit-App OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages OCR app gets an image a

Siva Prakash 5 Apr 05, 2022
Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)

Learning to Adapt Structured Output Space for Semantic Segmentation Pytorch implementation of our method for adapting semantic segmentation from the s

Yi-Hsuan Tsai 782 Dec 30, 2022
Code for Graph-to-Tree Learning for Solving Math Word Problems (ACL 2020)

Graph-to-Tree Learning for Solving Math Word Problems PyTorch implementation of Graph based Math Word Problem solver described in our ACL 2020 paper G

Jipeng Zhang 66 Nov 23, 2022
Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network Paddle-PANet 目录 结果对比 论文介绍 快速安装 结果对比 CTW1500 Method Backbone Fine

7 Aug 08, 2022
Groceries ARL: Association Rules (Birliktelik Kuralı)

Groceries_ARL Association Rules (Birliktelik Kuralı) Birliktelik kuralları, mark

Şebnem 5 Feb 08, 2022
Accompanying code for the paper "A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment".

#backdoor-HSIC (bd_HSIC) Accompanying code for the paper "A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment". To generate

Robert Hu 0 Nov 25, 2021
PyoMyo - Python Opensource Myo library

PyoMyo Python module for the Thalmic Labs Myo armband. Cross platform and multithreaded and works without the Myo SDK. pip install pyomyo Documentati

PerlinWarp 81 Jan 08, 2023
a Lightweight library for sequential learning agents, including reinforcement learning

SaLinA: SaLinA - A Flexible and Simple Library for Learning Sequential Agents (including Reinforcement Learning) TL;DR salina is a lightweight library

Facebook Research 405 Dec 17, 2022
SurfEmb (CVPR 2022) - SurfEmb: Dense and Continuous Correspondence Distributions

SurfEmb SurfEmb: Dense and Continuous Correspondence Distributions for Object Pose Estimation with Learnt Surface Embeddings Rasmus Laurvig Haugard, A

Rasmus Haugaard 56 Nov 19, 2022
Deploy pytorch classification model using Flask and Streamlit

Deploy pytorch classification model using Flask and Streamlit

Ben Seo 1 Nov 17, 2021
MoCoPnet - Deformable 3D Convolution for Video Super-Resolution

MoCoPnet: Exploring Local Motion and Contrast Priors for Infrared Small Target Super-Resolution Pytorch implementation of local motion and contrast pr

Xinyi Ying 28 Dec 15, 2022
Density-aware Single Image De-raining using a Multi-stream Dense Network (CVPR 2018)

DID-MDN Density-aware Single Image De-raining using a Multi-stream Dense Network He Zhang, Vishal M. Patel [Paper Link] (CVPR'18) We present a novel d

He Zhang 224 Dec 12, 2022
PyTorch implementation of MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

MoCo: Momentum Contrast for Unsupervised Visual Representation Learning This is a PyTorch implementation of the MoCo paper: @Article{he2019moco, aut

Meta Research 3.7k Jan 02, 2023
Creating Multi Task Models With Keras

Creating Multi Task Models With Keras About The Project! I used the keras and Tensorflow Library, To build a Deep Learning Neural Network to Creating

Srajan Chourasia 4 Nov 28, 2022
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with ONNX, TensorRT, ncnn, and OpenVINO supported.

Introduction YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and ind

7.7k Jan 03, 2023