[ICML 2021] Towards Understanding and Mitigating Social Biases in Language Models

Last update: Jan 03, 2023

Overview

Towards Understanding and Mitigating Social Biases in Language Models

This repo contains code and data for evaluating and mitigating social biases in language generation models.

Paper

Towards Understanding and Mitigating Social Biases in Language Models
Paul Pu Liang, Chiyu Wu, Louis-Philippe Morency, and Ruslan Salakhutdinov
ICML 2021

If you find this repository useful, please cite our paper:

@inproceedings{liang2021towards,
  title={Towards Understanding and Mitigating Social Biases in Language Models},
  author={Liang, Paul Pu and Wu, Chiyu and Morency, Louis-Philippe and Salakhutdinov, Ruslan},
  booktitle={International Conference on Machine Learning},
  pages={6565--6576},
  year={2021},
  organization={PMLR}
}

1. Identify bias-sensitive tokens, obtain bias subspace and create the dataset to train the bias classifier

python data_preprocess.py --embed_source glove --by_pca True --num_components 5 --save_subspace False

The GloVe and GPT-2 embeddings are large files; you can download or extract them yourself. We also provide a Google Drive link.
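For intuition, here is a minimal sketch of how a bias subspace can be obtained with PCA, in the spirit of the --by_pca and --num_components options above. It is not the code in data_preprocess.py: the function name bias_subspace, the word pairs, and the toy 4-dimensional vectors are assumptions made for illustration.

```python
import numpy as np

def bias_subspace(embeddings, pairs, num_components=5):
    """Return the top principal directions of pair-centered embeddings.

    embeddings: dict mapping word -> 1-D numpy vector (hypothetical input format).
    pairs: list of (word_a, word_b) definitional pairs, e.g. ("he", "she").
    """
    centered = []
    for a, b in pairs:
        center = (embeddings[a] + embeddings[b]) / 2.0
        # Subtracting the pair mean keeps only the within-pair difference.
        centered.append(embeddings[a] - center)
        centered.append(embeddings[b] - center)
    X = np.stack(centered)
    # Rows of vt are orthonormal principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:num_components]

# Toy 4-dimensional embeddings (made-up numbers, not GloVe vectors).
toy = {
    "he": np.array([1.0, 0.2, 0.0, 0.1]),
    "she": np.array([-1.0, 0.2, 0.0, 0.1]),
    "man": np.array([0.9, -0.3, 0.5, 0.0]),
    "woman": np.array([-1.1, -0.3, 0.5, 0.0]),
}
subspace = bias_subspace(toy, [("he", "she"), ("man", "woman")], num_components=1)
```

In this toy example the two pairs differ only along the first coordinate, so the single recovered direction is the first axis, up to sign.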

2. Train the bias classifier and learn the projection matrix P

python context_nullspace_projection.py

The nullspace projection code is from INLP. Thanks for their great work!

To run the INLP experiments, first git clone https://github.com/shauli-ravfogel/nullspace_projection and place it under the root directory of this repo.
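The idea behind learning a projection matrix P by iterative nullspace projection can be sketched in a few lines. This is a simplified stand-in, not the INLP implementation or context_nullspace_projection.py: it uses a least-squares linear predictor in place of a trained classifier, and the name inlp_sketch and the random toy data are assumptions.

```python
import numpy as np

def inlp_sketch(X, y, num_iters=2):
    """Return a projection matrix P that removes linearly decodable label directions.

    X: (n, d) features; y: (n,) binary labels in {0, 1}.
    """
    d = X.shape[1]
    P = np.eye(d)
    directions = []
    targets = 2.0 * np.asarray(y, dtype=float) - 1.0  # map {0, 1} to {-1, +1}
    for _ in range(num_iters):
        # Fit a linear predictor on the currently projected features.
        w, *_ = np.linalg.lstsq(X @ P, targets, rcond=None)
        norm = np.linalg.norm(w)
        if norm < 1e-8:
            break  # nothing left to remove
        directions.append(w / norm)
        # Orthonormalize all directions found so far and project them out.
        Q, _ = np.linalg.qr(np.stack(directions, axis=1))
        P = np.eye(d) - Q @ Q.T
    return P

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(float)  # label depends only on the first feature
P = inlp_sketch(X, y, num_iters=2)
```

The returned P is symmetric and idempotent, so applying it more than once has no further effect. Each iteration removes one more direction along which the labels could still be predicted linearly.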

3. Evaluate bias in GPT-2

Local Bias

cd src/local_bias
python measure_local_bias.py

Running the evaluation script on the full data takes a long time. For now we provide a subset of our evaluation data; the full data will be uploaded via Google Drive soon.
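As a rough illustration of what a local bias measurement involves, the sketch below compares the next-token distributions a model produces for two contexts that differ only in a demographic term (for example "The man worked as" versus "The woman worked as"). It is not measure_local_bias.py: the logits are made-up numbers rather than GPT-2 outputs, and the choice of KL divergence and Hellinger distance here is an assumption for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert a 1-D array of logits into a probability distribution."""
    z = logits - np.max(logits)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def hellinger_distance(p, q):
    """Hellinger distance, which lies in [0, 1] and is symmetric in p and q."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

# Hypothetical next-token logits over a 4-word vocabulary for two paired contexts.
logits_a = np.array([2.0, 1.0, 0.5, -1.0])
logits_b = np.array([1.0, 2.0, 0.5, -1.0])
p_a, p_b = softmax(logits_a), softmax(logits_b)

kl = kl_divergence(p_a, p_b)
hellinger = hellinger_distance(p_a, p_b)
```

Both quantities are zero when the two contexts yield identical next-token distributions and grow as the distributions diverge.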

Global Bias

We use the regard score difference as the metric for global bias. The evaluation code is from https://github.com/ewsheng/nlg-bias. Thanks for their great work!

git clone https://github.com/ewsheng/nlg-bias.git
cd src/global_bias
python generate_full_sentence.py --algorithm INLP

After the full sentences are generated, use the regard classifier to measure global bias.
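Once each generated sentence has a regard label, a regard score difference can be summarized along the following lines. This sketch assumes the labels are already available as integers (-1 negative, 0 neutral, 1 positive) and does not call the nlg-bias classifier; the label encoding, the function names, and the toy label lists are all assumptions.

```python
from collections import Counter

REGARD_LABELS = (-1, 0, 1)  # assumed encoding: negative, neutral, positive

def regard_distribution(labels):
    """Fraction of sentences that received each regard label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {r: counts.get(r, 0) / len(labels) for r in REGARD_LABELS}

def regard_gap(labels_a, labels_b):
    """Per-label difference in regard proportions between two groups."""
    dist_a = regard_distribution(labels_a)
    dist_b = regard_distribution(labels_b)
    return {r: dist_a[r] - dist_b[r] for r in REGARD_LABELS}

# Toy regard labels for sentences generated from two demographic prompts.
group_a = [1, 0, 0, -1]
group_b = [-1, -1, 0, 1]
gap = regard_gap(group_a, group_b)
```

A gap close to zero for every label means the two groups receive similar regard; a large gap on the negative label indicates that one group is described negatively more often.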

To reproduce the results in our paper, we also provide the projection matrix P for the gender bias test in data/saved_P/P_gender_test_79.npy.
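A saved projection matrix in .npy format can be loaded with np.load and applied to hidden states by matrix multiplication. The sketch below shows the mechanics on a toy matrix written to a temporary file; the shape and orientation of the real P_gender_test_79.npy, and how the repo's generation script applies it, are not documented here, so the square (d, d) layout and the helper apply_projection are assumptions.

```python
import os
import tempfile

import numpy as np

def apply_projection(P, hidden):
    """Project hidden states of shape (..., d) with a square (d, d) matrix P."""
    if P.shape[0] != P.shape[1] or hidden.shape[-1] != P.shape[0]:
        raise ValueError("P must be (d, d) and hidden must end in dimension d")
    return hidden @ P.T

# Toy stand-in for a saved projection: removes the first coordinate direction.
d = 4
v = np.zeros(d)
v[0] = 1.0
toy_P = np.eye(d) - np.outer(v, v)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_P.npy")  # hypothetical file, not the repo's matrix
    np.save(path, toy_P)
    P = np.load(path)

hidden = np.array([[2.0, 1.0, -1.0, 0.5]])
projected = apply_projection(P, hidden)
```

To try this on the provided matrix, replace the temporary path with data/saved_P/P_gender_test_79.npy and check that the loaded shape matches the hidden size of the model you are using.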


Owner

Paul Liang
