Propose a principled and practically effective framework for unsupervised accuracy estimation and error detection tasks with theoretical analysis and state-of-the-art performance.

Overview

Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles

This project is for the paper: Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles.

Experimental Results

Main Results

Preliminaries

It is tested under Ubuntu Linux 16.04.1 and Python 3.6 environment, and requries some packages to be installed:

Downloading Datasets

  • MNIST-M: download it from the Google drive. Extract the files and place them in ./dataset/mnist_m/.
  • SVHN: need to download Format 2 data (*.mat). Place the files in ./dataset/svhn/.
  • USPS: download the usps.h5 file. Place the file in ./dataset/usps/.

Overview of the Code

  • train_model.py: train standard models via supervised learning.
  • train_dann.py: train domain adaptive (DANN) models.
  • eval_pipeline.py: evaluate various methods on all tasks.

Running Experiments

Examples

  • To train a standard model via supervised learning, you can use the following command:

python train_model.py --source-dataset {source dataset} --model-type {model type} --base-dir {directory to save the model}

{source dataset} can be mnist, mnist-m, svhn or usps.

{model type} can be typical_dnn or dann_arch.

  • To train a domain adaptive (DANN) model, you can use the following command:

python train_dann.py --source-dataset {source dataset} --target-dataset {target dataset} --base-dir {directory to save the model} [--test-time]

{source dataset} (or {target dataset}) can be mnist, mnist-m, svhn or usps.

The argument --test-time is to indicate whether to replace the target training dataset with the target test dataset.

  • To evaluate a method on all training-test dataset pairs, you can use the following command:

python eval_pipeline.py --model-type {model type} --method {method}

{model type} can be typical_dnn or dann_arch.

{method} can be conf_avg, ensemble_conf_avg, conf, trust_score, proxy_risk, our_ri or our_rm.

Train All Models

You can run the following scrips to pre-train all models needed for the experiments.

  • run_all_model_training.sh: train all supervised learning models.
  • run_all_dann_training.sh: train all DANN models.
  • run_all_ensemble_training.sh: train all ensemble models.

Evaluate All Methods

You can run the following script to get the results reported in the paper.

  • run_all_evaluation.sh: evaluate all methods on all tasks.

Acknowledgements

Part of this code is inspired by estimating-generalization and TrustScore.

Citation

Please cite our work if you use the codebase:

@article{chen2021detecting,
  title={Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles},
  author={Chen, Jiefeng and Liu, Frederick and Avci, Besim and Wu, Xi and Liang, Yingyu and Jha, Somesh},
  journal={arXiv preprint arXiv:2106.15728},
  year={2021}
}

License

Please refer to the LICENSE.

Owner
Jiefeng Chen
Phd student at UW-Madision, working on trustworthy machine learning.
Jiefeng Chen
PyTorch implementation of CVPR 2020 paper (Reference-Based Sketch Image Colorization using Augmented-Self Reference and Dense Semantic Correspondence) and pre-trained model on ImageNet dataset

Reference-Based-Sketch-Image-Colorization-ImageNet This is a PyTorch implementation of CVPR 2020 paper (Reference-Based Sketch Image Colorization usin

Yuzhi ZHAO 11 Jul 28, 2022
Multi-robot collaborative exploration and mapping through Voronoi partition and DRL in unknown environment

Voronoi Multi_Robot Collaborate Exploration Introduction In the unknown environment, the cooperative exploration of multiple robots is completed by Vo

PeaceWord 6 Nov 22, 2022
Watch faces morph into each other with StyleGAN 2, StyleGAN, and DCGAN!

FaceMorpher FaceMorpher is an innovative project to get a unique face morph (or interpolation for geeks) on a website. Yes, this means you can see fac

Anish 9 Jun 24, 2022
CLIP (Contrastive Language–Image Pre-training) for Italian

Italian CLIP CLIP (Radford et al., 2021) is a multimodal model that can learn to represent images and text jointly in the same space. In this project,

Italian CLIP 114 Dec 29, 2022
Meta Self-learning for Multi-Source Domain Adaptation: A Benchmark

Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark Project | Arxiv | YouTube | | Abstract In recent years, deep learning-based methods

CVSM Group - email: <a href=[email protected]"> 188 Dec 12, 2022
Code for "Learning Canonical Representations for Scene Graph to Image Generation", Herzig & Bar et al., ECCV2020

Learning Canonical Representations for Scene Graph to Image Generation (ECCV 2020) Roei Herzig*, Amir Bar*, Huijuan Xu, Gal Chechik, Trevor Darrell, A

roei_herzig 24 Jul 07, 2022
Segmentation Training Pipeline

Segmentation Training Pipeline This package is a part of Musket ML framework. Reasons to use Segmentation Pipeline Segmentation Pipeline was developed

Musket ML 52 Dec 12, 2022
PyTorch Implementation of DSB for Score Based Generative Modeling. Experiments managed using Hydra.

Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling This repository contains the implementation for the paper Diffusion

James Thornton 50 Jan 03, 2023
An implementation of the "Attention is all you need" paper without extra bells and whistles, or difficult syntax

Simple Transformer An implementation of the "Attention is all you need" paper without extra bells and whistles, or difficult syntax. Note: The only ex

29 Jun 16, 2022
Computer vision - fun segmentation experience using classic and deep tools :)

Computer_Vision_Segmentation_Fun Segmentation of Images and Video. Tools: pytorch Models: Classic model - GrabCut Deep model - Deeplabv3_resnet101 Flo

Mor Ventura 1 Dec 18, 2021
Code and data accompanying our SVRHM'21 paper.

Code and data accompanying our SVRHM'21 paper. Requires tensorflow 1.13, python 3.7, scikit-learn, and pytorch 1.6.0 to be installed. Python scripts i

5 Nov 17, 2021
Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style disentanglement in image generation and translation" (ICCV 2021)

DiagonalGAN Official Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Trans

32 Dec 06, 2022
Deep learning models for classification of 15 common weeds in the southern U.S. cotton production systems.

CottonWeeds Deep learning models for classification of 15 common weeds in the southern U.S. cotton production systems. requirements pytorch torchsumma

Dong Chen 8 Jun 07, 2022
Bilinear attention networks for visual question answering

Bilinear Attention Networks This repository is the implementation of Bilinear Attention Networks for the visual question answering and Flickr30k Entit

Jin-Hwa Kim 506 Nov 29, 2022
An Efficient Training Approach for Very Large Scale Face Recognition or F²C for simplicity.

Fast Face Classification (F²C) This is the code of our paper An Efficient Training Approach for Very Large Scale Face Recognition or F²C for simplicit

33 Jun 27, 2021
Fusion-in-Decoder Distilling Knowledge from Reader to Retriever for Question Answering

This repository contains code for: Fusion-in-Decoder models Distilling Knowledge from Reader to Retriever Dependencies Python 3 PyTorch (currently tes

Meta Research 323 Dec 19, 2022
190 Jan 03, 2023
Why Are You Weird? Infusing Interpretability in Isolation Forest for Anomaly Detection

Why, hello there! This is the supporting notebook for the research paper — Why Are You Weird? Infusing Interpretability in Isolation Forest for Anomal

2 Dec 14, 2021
Dataset and Code for the paper "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021), and "Depth-only Object Tracking" (BMVC2021)

DeT and DOT Code and datasets for "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021) "Depth-only Object Tracking" (BMVC2021) @InProceedings

Yan Song 55 Dec 15, 2022
Prefix-Tuning: Optimizing Continuous Prompts for Generation

Prefix Tuning Files: . ├── gpt2 # Code for GPT2 style autoregressive LM │ ├── train_e2e.py # high-level script

530 Jan 04, 2023