Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models"

Overview

Introduction

Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models".

In this work, we demonstrate that existing self-supervised speech model such as HuBERT, wav2vec 2.0, CPC and TERA are vulnerable to membership inference attack (MIA) and thus could reveal sensitive informations related to the training data.

Requirements

  1. Python >= 3.6
  2. Install sox on your OS
  3. Install s3prl on your OS
git clone https://github.com/s3prl/s3prl
cd s3prl
pip install -e ./
  1. Install the specific fairseq
pip install [email protected]+https://github.com//pytorch/[email protected]#egg=fairseq

Preprocessing

First, extract the self-supervised feature of utterances in each corpus according to your needs.

Currently, only LibriSpeech is available.

BASE_PATH=/path/of/the/corpus
OUTPUT_PATH=/path/to/save/feature
MODEL=wav2vec2
SPLIT=train-clean-100 # you should extract train-clean-100, dev-clean, dev-other, test-clean, test-other

python preprocess_feature_LibriSpeech.py \
    --base_path $BATH_PATH \
    --output_path $OUTPUT_PATH \
    --model $MODEL \
    --split $SPLIT

Speaker-level MIA

After extracting the features, you can apply the attack against the models using either basic attack and improved attack.

Noted that you should run the basic attack to generate the .csv file with similarity scores before performing improved attack.

Basic Attack

SEEN_BASE_PATH=/path/you/save/feature/of/seen/corpus
UNSEEN_BASE_PATH=/path/you/save/feature/of/unseen/corpus
OUTPUT_PATH=/path/to/output/results
MODEL=wav2vec2

python predefined-speaker-level-MIA.py \
    --seen_base_path $SEEN_BATH_PATH \
    --unseen_base_path $UNSEEN_BATH_PATH \
    --output_path $OUTPUT_PATH \
    --model $MODEL \

Improved Attack

python train-speaker-level-similarity-model.py \
    --seen_base_path $UNSEEN_BATH_PATH \
    --output_path $OUTPUT_PATH \
    --model $MODEL \
    --speaker_list "${OUTPUT_PATH}/${MODEL}-customized-speaker-level-attack-similarity.csv"

python customized-speaker-level-MIA.py \
    --seen_base_path $SEEN_BATH_PATH \
    --unseen_base_path $UNSEEN_BATH_PATH \
    --output_path $OUTPUT_PATH \
    --model $MODEL \
    --similarity_model_path "${OUTPUT_PATH}/customized-speaker-similarity-model-${MODEL}.pt"

Utterance-level MIA

The process for utterance-level MIA is similar to that of speaker-level:

Basic Attack

SEEN_BASE_PATH=/path/you/save/feature/of/seen/corpus
UNSEEN_BASE_PATH=/path/you/save/feature/of/unseen/corpus
OUTPUT_PATH=/path/to/output/results
MODEL=wav2vec2

python predefined-utterance-level-MIA.py \
    --seen_base_path $SEEN_BATH_PATH \
    --unseen_base_path $UNSEEN_BATH_PATH \
    --output_path $OUTPUT_PATH \
    --model $MODEL \

Improved Attack

python train-utterance-level-similarity-model.py \
    --seen_base_path $UNSEEN_BATH_PATH \
    --output_path $OUTPUT_PATH \
    --model $MODEL \
    --speaker_list "${OUTPUT_PATH}/${MODEL}-customized-utterance-level-attack-similarity.csv"

python customized-utterance-level-MIA.py \
    --seen_base_path $SEEN_BATH_PATH \
    --unseen_base_path $UNSEEN_BATH_PATH \
    --output_path $OUTPUT_PATH \
    --model $MODEL \
    --similarity_model_path "${OUTPUT_PATH}/customized-utterance-similarity-model-${MODEL}.pt"

Citation

If you find our work useful, please cite:

Owner
Wei-Cheng Tseng
Wei-Cheng Tseng
Semi-Supervised Signed Clustering Graph Neural Network (and Implementation of Some Spectral Methods)

SSSNET SSSNET: Semi-Supervised Signed Network Clustering For details, please read our paper. Environment Setup Overview The project has been tested on

Yixuan He 9 Nov 24, 2022
Paper Title: Heterogeneous Knowledge Distillation for Simultaneous Infrared-Visible Image Fusion and Super-Resolution

HKDnet Paper Title: "Heterogeneous Knowledge Distillation for Simultaneous Infrared-Visible Image Fusion and Super-Resolution" Email:

wasteland 11 Nov 12, 2022
Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis

Introduction This is an implementation of our paper Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis.

24 Dec 06, 2022
Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics

Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics

14 Nov 06, 2022
Pytorch implementation of BRECQ, ICLR 2021

BRECQ Pytorch implementation of BRECQ, ICLR 2021 @inproceedings{ li&gong2021brecq, title={BRECQ: Pushing the Limit of Post-Training Quantization by Bl

Yuhang Li 148 Dec 28, 2022
[SIGGRAPH 2022 Journal Track] AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars Fangzhou Hong1*  Mingyuan Zhang1*  Liang Pan1  Zhongang Cai1,2,3  Lei Yang2 

Fangzhou Hong 749 Jan 04, 2023
DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

DatasetGAN This is the official code and data release for: DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort Yuxuan Zhang*, Huan Li

302 Jan 05, 2023
Direct Multi-view Multi-person 3D Human Pose Estimation

Implementation of NeurIPS-2021 paper: Direct Multi-view Multi-person 3D Human Pose Estimation [paper] [video-YouTube, video-Bilibili] [slides] This is

Sea AI Lab 251 Dec 30, 2022
A Deep Learning based project for creating line art portraits.

ArtLine The main aim of the project is to create amazing line art portraits. Sounds Intresting,let's get to the pictures!! Model-(Smooth) Model-(Quali

Vijish Madhavan 3.3k Jan 07, 2023
NeurIPS workshop paper 'Counter-Strike Deathmatch with Large-Scale Behavioural Cloning'

Counter-Strike Deathmatch with Large-Scale Behavioural Cloning Tim Pearce, Jun Zhu Offline RL workshop, NeurIPS 2021 Paper: https://arxiv.org/abs/2104

Tim Pearce 169 Dec 26, 2022
A program that can analyze videos according to the weights you select

MaskMonitor A program that can analyze videos according to the weights you select 下載 訓練完的 weight檔案 執行 MaskDetection.py 內部可更改 輸入來源(鏡頭, 影片, 圖片) 以及輸出條件(人

Patrick_star 1 Nov 07, 2021
Shitty gaze mouse controller

demo.mp4 shitty_gaze_mouse_cotroller install tensofflow, cv2 run the main.py and as it starts it will collect data so first raise your left eyebrow(bo

16 Aug 30, 2022
Agent-based model simulator for air quality and pandemic risk assessment in architectural spaces

Agent-based model simulation for air quality and pandemic risk assessment in architectural spaces. User Guide archABM is a fast and open source agent-

Vicomtech 10 Dec 05, 2022
SMCA replication There are no extra compiled components in SMCA DETR and package dependencies are minimal

Usage There are no extra compiled components in SMCA DETR and package dependencies are minimal, so the code is very simple to use. We provide instruct

22 May 06, 2022
PyTorch code for training MM-DistillNet for multimodal knowledge distillation

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge MM-DistillNet is a

51 Dec 20, 2022
Asymmetric Bilateral Motion Estimation for Video Frame Interpolation, ICCV2021

ABME (ICCV2021) Junheum Park, Chul Lee, and Chang-Su Kim Official PyTorch Code for "Asymmetric Bilateral Motion Estimation for Video Frame Interpolati

Junheum Park 86 Dec 28, 2022
Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks

Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks arXiv preprint: https://arxiv.org/abs/2201.02143. Architec

19 Nov 30, 2022
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Introduction PyTorch3D provides efficient, reusable components for 3D Computer Vision research with PyTorch. Key features include: Data structure for

Facebook Research 6.8k Jan 01, 2023
Implementation of "Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner"

Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner This repository is the official implementation of Meta-rPPG: Remote Heart Ra

Eugene Lee 137 Dec 13, 2022
Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network"

M3D-VTON: A Monocular-to-3D Virtual Try-On Network Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network" Paper | Suppl

109 Dec 29, 2022