Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation

Last update: Nov 30, 2022

Overview

Knowledge Distillation for BERT Unsupervised Domain Adaptation

Official PyTorch implementation | Paper

Abstract

A pre-trained language model, BERT, has brought significant performance improvements across a range of natural language processing tasks. Since the model is trained on a large corpus of diverse topics, it shows robust performance for domain shift problems in which data distributions at training (source data) and testing (target data) differ while sharing similarities. Despite its great improvements compared to previous models, it still suffers from performance degradation due to domain shifts. To mitigate such problems, we propose a simple but effective unsupervised domain adaptation method, adversarial adaptation with distillation (AAD), which combines the adversarial discriminative domain adaptation (ADDA) framework with knowledge distillation. We evaluate our approach in the task of cross-domain sentiment classification on 30 domain pairs, advancing the state-of-the-art performance for unsupervised domain adaptation in text sentiment classification.
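The combination of an ADDA-style adversarial objective with knowledge distillation described above can be sketched in a few lines. This is a framework-free illustration, not the repository's code: the function names, the equal weighting of the two target-encoder terms, and the default temperature of 2.0 are all assumptions made here for clarity. The teacher is taken to be the frozen source model and the student the target encoder being adapted.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - lse for z in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the usual convention for keeping gradient magnitudes
    comparable across temperatures; the default T is an assumption.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy between a predicted probability and a 0/1 label."""
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))


def discriminator_loss(d_src, d_tgt):
    """Discriminator step: source features are labeled 1, target features 0."""
    return binary_cross_entropy(d_src, 1.0) + binary_cross_entropy(d_tgt, 0.0)


def target_encoder_loss(d_tgt, teacher_logits, student_logits, temperature=2.0):
    """Target-encoder step: fool the discriminator (inverted label) while
    staying close to the teacher's predictions. Equal weighting of the two
    terms is an assumption of this sketch."""
    adversarial = binary_cross_entropy(d_tgt, 1.0)
    distill = distillation_loss(teacher_logits, student_logits, temperature)
    return adversarial + distill
```

Here `d_src` and `d_tgt` stand for the discriminator's output probabilities on source-encoder and target-encoder features. The distillation term acts as a regularizer: it is zero when the student reproduces the teacher's logits and grows as the adapted encoder drifts away from them.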

Requirements

pandas
pytorch
transformers

Run the test

$ python main.py --pretrain --adapt --src books --tgt dvd
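To evaluate more than one source/target pair, the command above can be generated for every ordered pair of domains. The helper below is a hypothetical convenience, not part of the repository; it only builds the command lines using the flags shown above, and it assumes each domain name is accepted by both `--src` and `--tgt`. Only `books` and `dvd` are taken from this README.

```python
from itertools import permutations


def build_commands(domains, script="main.py"):
    """Return one argv list per ordered (source, target) domain pair.

    build_commands is a hypothetical helper; the flags are the ones
    shown in the README's example command.
    """
    return [
        ["python", script, "--pretrain", "--adapt", "--src", src, "--tgt", tgt]
        for src, tgt in permutations(domains, 2)
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected first.
    for cmd in build_commands(["books", "dvd"]):
        print(" ".join(cmd))
```

A list of n domains yields n(n-1) commands, so the list grows quickly. Each argv list can be passed to `subprocess.run` once the corresponding data is available.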

How to cite

@article{ryu2020knowledge,
  title={Knowledge Distillation for BERT Unsupervised Domain Adaptation},
  author={Ryu, Minho and Lee, Kichun},
  journal={arXiv preprint arXiv:2010.11478},
  year={2020}
}

Owner

Minho Ryu
