Algorithmic encoding of protected characteristics and its implications on disparities across subgroups

Overview

Algorithmic encoding of protected characteristics and its implications on disparities across subgroups

Components of a deep neural networks

This repository contains the code for the paper

B. Glocker, S. Winzeck. Algorithmic encoding of protected characteristics and its implications on disparities across subgroups. 2021. under review. arXiv:2110.14755

Dataset

The CheXpert imaging dataset together with the patient demographic information used in this work can be downloaded from https://stanfordmlgroup.github.io/competitions/chexpert/.

Code

For running the code, we recommend setting up a dedicated Python environment.

Setup Python environment using conda

Create and activate a Python 3 conda environment:

conda create -n pymira python=3
conda activate chexploration

Install PyTorch using conda:

conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

Setup Python environment using virtualenv

Create and activate a Python 3 virtual environment:

virtualenv -p python3 <path_to_envs>/chexploration
source <path_to_envs>/chexploration/bin/activate

Install PyTorch using pip:

pip install torch torchvision

Install additional Python packages:

pip install matplotlib jupyter pandas seaborn pytorch-lightning scikit-learn scikit-image tensorboard tqdm openpyxl

How to use

In order to replicate the results presented in the paper, please follow these steps:

  1. Download the CheXpert dataset, copy the file train.csv to the datafiles folder
  2. Download the CheXpert demographics data, copy the file CHEXPERT DEMO.xlsx to the datafiles folder
  3. Run the notebook chexpert.sample.ipynb to generate the study data
  4. Adjust the variable img_data_dir to point to the imaging data and run the following scripts
  5. Run the notebook chexpert.predictions.ipynb to evaluate all three prediction models
  6. Run the notebook chexpert.explorer.ipynb for the unsupervised exploration of feature representations

Additionally, there are scripts chexpert.sex.split.py and chexpert.race.split.py to run SPLIT on the disease detection model. The default setting in all scripts is to train a DenseNet-121 using the training data from all patients. The results for models trained on subgroups only can be produced by changing the path to the datafiles (e.g., using full_sample_train_white.csv and full_sample_val_white.csv instead of full_sample_train.csv and full_sample_val.csv).

Note, the Python scripts also contain code for running the experiments using a ResNet-34 backbone which requires less GPU memory.

Trained models

All trained models, feature embeddings and output predictions can be found here.

Funding sources

This work is supported through funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 757173, Project MIRA, ERC-2017-STG) and by the UKRI London Medical Imaging & Artificial Intelligence Centre for Value Based Healthcare.

License

This project is licensed under the Apache License 2.0.

Owner
Team MIRA - BioMedIA
Team MIRA - BioMedIA
Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models"

Introduction Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models". In this work, we demonstrate that existi

Wei-Cheng Tseng 7 Nov 01, 2022
Cowsay - A rewrite of cowsay in python

Python Cowsay A rewrite of cowsay in python. Allows for parsing of existing .cow

James Ansley 3 Jun 27, 2022
Python codes for Lite Audio-Visual Speech Enhancement.

Lite Audio-Visual Speech Enhancement (Interspeech 2020) Introduction This is the PyTorch implementation of Lite Audio-Visual Speech Enhancement (LAVSE

Shang-Yi Chuang 85 Dec 01, 2022
Voxel Transformer for 3D object detection

Voxel Transformer This is a reproduced repo of Voxel Transformer for 3D object detection. The code is mainly based on OpenPCDet. Introduction We provi

173 Dec 25, 2022
Collect super-resolution related papers, data, repositories

Collect super-resolution related papers, data, repositories

WangChaofeng 1.7k Jan 03, 2023
mmdetection version of TinyBenchmark.

introduction This project is an mmdetection version of TinyBenchmark. TODO list: add TinyPerson dataset and evaluation add crop and merge for image du

34 Aug 27, 2022
Official implementation for "Style Transformer for Image Inversion and Editing" (CVPR 2022)

Style Transformer for Image Inversion and Editing (CVPR2022) https://arxiv.org/abs/2203.07932 Existing GAN inversion methods fail to provide latent co

Xueqi Hu 153 Dec 02, 2022
《Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching》(CVPR 2020)

This contains the codes for cross-view geo-localization method described in: Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching, CVPR2020.

41 Oct 27, 2022
Simple and Robust Loss Design for Multi-Label Learning with Missing Labels

Simple and Robust Loss Design for Multi-Label Learning with Missing Labels Official PyTorch Implementation of the paper Simple and Robust Loss Design

Xinyu Huang 28 Oct 27, 2022
Implementation of paper "Self-supervised Learning on Graphs:Deep Insights and New Directions"

SelfTask-GNN A PyTorch implementation of "Self-supervised Learning on Graphs: Deep Insights and New Directions". [paper] In this paper, we first deepe

Wei Jin 85 Oct 13, 2022
FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection arXi

59 Nov 29, 2022
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

AI Wizards for Software Management (AWSM) Research Group 14 Nov 13, 2022
PlenOctrees: NeRF-SH Training & Conversion

PlenOctrees Official Repo: NeRF-SH training and conversion This repository contains code to train NeRF-SH and to extract the PlenOctree, constituting

Alex Yu 323 Dec 29, 2022
This provides the R code and data to replicate results in "The USS Trustee’s risky strategy"

USSBriefs2021 This provides the R code and data to replicate results in "The USS Trustee’s risky strategy" by Neil M Davies, Jackie Grant and Chin Yan

1 Oct 30, 2021
Lama-cleaner: Image inpainting tool powered by LaMa

Lama-cleaner: Image inpainting tool powered by LaMa

Qing 5.8k Jan 05, 2023
Hydra: an Extensible Fuzzing Framework for Finding Semantic Bugs in File Systems

Hydra: An Extensible Fuzzing Framework for Finding Semantic Bugs in File Systems Paper Finding Semantic Bugs in File Systems with an Extensible Fuzzin

gts3.org (<a href=[email protected])"> 129 Dec 15, 2022
Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation

Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation [Arxiv] [Video] Evaluation code for Unrestricted Facial Geometry Reconstr

Matan Sela 242 Dec 30, 2022
Code for paper PairRE: Knowledge Graph Embeddings via Paired Relation Vectors.

PairRE Code for paper PairRE: Knowledge Graph Embeddings via Paired Relation Vectors. This implementation of PairRE for Open Graph Benchmak datasets (

Alipay 65 Dec 19, 2022
Code for "Infinitely Deep Bayesian Neural Networks with Stochastic Differential Equations"

Infinitely Deep Bayesian Neural Networks with SDEs This library contains JAX and Pytorch implementations of neural ODEs and Bayesian layers for stocha

Winnie Xu 95 Nov 26, 2021
B2EA: An Evolutionary Algorithm Assisted by Two Bayesian Optimization Modules for Neural Architecture Search

B2EA: An Evolutionary Algorithm Assisted by Two Bayesian Optimization Modules for Neural Architecture Search This is the offical implementation of the

SNU ADSL 0 Feb 07, 2022