Differential Privacy for Heterogeneous Federated Learning: Utility & Privacy Tradeoffs

Overview

In this work, we propose DP-SCAFFOLD(-warm), a differentially private variant of the SCAFFOLD algorithm (the warm version uses a careful initialisation of the parameters), to tackle heterogeneity under the formal privacy constraints of Differential Privacy (DP) in a federated learning framework. Using refined results from DP theory, we establish both privacy and utility guarantees, which show the superiority of DP-SCAFFOLD over the naive algorithm DP-FedAvg. We provide numerical experiments that confirm our analysis and show significant gains for DP-SCAFFOLD, especially when the number of local updates or the level of heterogeneity between users grows.
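As an illustration of the drift correction that SCAFFOLD relies on, here is a minimal sketch of one user's local phase in plain Python. It follows the published SCAFFOLD update (local step with control variates, then the "Option II" control-variate refresh) and is not taken from this repository. The names `scaffold_local_steps` and `grad_fn` are hypothetical, and in a DP variant `grad_fn` would presumably return a clipped, noised mini-batch gradient.

```python
def scaffold_local_steps(params, grad_fn, c_local, c_global, lr, num_steps):
    """Run num_steps of drift-corrected gradient steps for one user.

    Returns the updated local parameters and the refreshed local
    control variate. All names here are illustrative assumptions.
    """
    y = list(params)
    for _ in range(num_steps):
        g = grad_fn(y)
        # SCAFFOLD step: correct the local gradient by (c_global - c_local).
        y = [yj - lr * (gj - clj + cgj)
             for yj, gj, clj, cgj in zip(y, g, c_local, c_global)]
    # "Option II" refresh: c_i <- c_i - c + (x - y_i) / (K * lr).
    new_c_local = [clj - cgj + (pj - yj) / (num_steps * lr)
                   for clj, cgj, pj, yj in zip(c_local, c_global, params, y)]
    return y, new_c_local


# Toy objective f(y) = (y - 1)^2, whose gradient is 2 * (y - 1).
toy_grad = lambda y: [2.0 * (y[0] - 1.0)]
new_params, new_c = scaffold_local_steps(
    params=[0.0], grad_fn=toy_grad, c_local=[0.0], c_global=[0.0],
    lr=0.25, num_steps=2,
)
```

With zero initial control variates, the refreshed local control variate equals the average of the gradients seen along the local trajectory, which is what lets the server later estimate and subtract each user's drift.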

Two datasets are studied:

  • a real-world dataset called Femnist (an extended version of the EMNIST dataset for federated learning), for which the figure shows the accuracy growing with the number of communication rounds (50 local updates first, then 100 local updates):

[Figure: image_femnist]

  • synthetic data called Logistic, built for logistic regression models, for which the figure shows the train loss decreasing with the number of communication rounds (50 local updates first, then 100 local updates):

[Figure: image_logistic]

Significant results are available for both of these datasets for logistic regression models.

Structure of the code

  • main.py: four global options are available.
    • generate: to generate data, introduce heterogeneity, split data between users for federated learning and preprocess data
    • optimum (after generate): to run a training phase on the unsplit data and save the "best" empirical model in a centralized setting, so that rates of convergence can be compared properly
    • simulation (after generate and optimum): to run several simulations of federated learning and save the results (accuracy, loss...)
    • plot (after simulation): to plot visuals

./data

Contains the generators of synthetic (Logistic) and real-world (Femnist) data (file data_generator.py), designed for a federated learning framework and controlled by a similarity parameter. Each folder contains a data directory where the generated data (train and test) is stored.

./flearn

  • differential_privacy: contains the code that applies the Gaussian mechanism (designed to add differential privacy to mini-batch stochastic gradients)
  • optimizers: contains the optimization framework for each algorithm (adaptation of stochastic gradient descent)
  • servers: contains the super class Server (in server_base.py), which is adapted to FedAvg and SCAFFOLD (algorithm from the point of view of the server)
  • trainmodel: contains the learning model structures
  • users: contains the super class User (in user_base.py), which is adapted to FedAvg and SCAFFOLD (algorithm from the point of view of any user)
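The Gaussian mechanism on mini-batch gradients mentioned above can be sketched as follows. This is a standard-library illustration under stated assumptions, not the code in differential_privacy: the function names, the per-sample clipping threshold `max_norm`, and the noise calibration `noise_multiplier * max_norm` are assumptions chosen to match the usual clip-then-noise recipe.

```python
import math
import random


def clip(grad, max_norm):
    """Scale grad down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_minibatch_gradient(per_sample_grads, max_norm, noise_multiplier, rng):
    """Average of clipped per-sample gradients with Gaussian noise added.

    Noise with standard deviation noise_multiplier * max_norm is added to
    each coordinate of the clipped sum before dividing by the batch size
    (an assumed, commonly used calibration).
    """
    batch_size = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    clipped = [clip(g, max_norm) for g in per_sample_grads]
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    std = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    return [v / batch_size for v in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
# With a zero noise multiplier only the clipping is visible.
noiseless = private_minibatch_gradient(grads, 1.0, 0.0, rng)
noisy = private_minibatch_gradient(grads, 1.0, 1.0, rng)
```

Clipping bounds the influence of any single sample on the sum, which is what allows the noise scale to be tied to `max_norm`; the resulting privacy level depends on the noise multiplier, the sampling scheme, and the number of steps, none of which this sketch accounts for.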

./models

Stores the latest models saved during the training phase of federated learning.

./results

Stores several metrics of convergence for each simulation, each similarity/privacy setting and each algorithm.

Metrics (evaluated at each round of communication):

  • test accuracy over all users,
  • train loss over all users,
  • highest norm of parameter difference (server/user) over all selected users,
  • train gradient dissimilarity over all users.
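The last two metrics can be illustrated with a short sketch. The exact definitions used in the repository are not given here, so the formulas below are assumptions: the parameter-difference metric is taken as the largest L2 distance between the server model and a selected user's model, and gradient dissimilarity as the mean squared L2 distance between each user's gradient and the average gradient. The function names are hypothetical.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in vec))


def max_parameter_difference(server_params, user_params_list):
    """Largest L2 distance between the server model and any selected user."""
    return max(
        l2_norm([u - s for u, s in zip(user, server_params)])
        for user in user_params_list
    )


def gradient_dissimilarity(user_grads):
    """Mean squared L2 distance of each user's gradient to the average gradient."""
    n = len(user_grads)
    dim = len(user_grads[0])
    mean = [sum(g[j] for g in user_grads) / n for j in range(dim)]
    return sum(
        l2_norm([g[j] - mean[j] for j in range(dim)]) ** 2 for g in user_grads
    ) / n


drift = max_parameter_difference([0.0, 0.0], [[3.0, 4.0], [1.0, 0.0]])
dissimilarity = gradient_dissimilarity([[1.0, 0.0], [-1.0, 0.0]])
```

Under these assumed definitions, both quantities are zero when all users hold identical data and models, and they grow as the users' local objectives pull in different directions.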

Software requirements:

  • To install the dependencies: pip install -r requirements.txt
