Privacy-Preserving Machine Learning (PPML) Tutorial Presented at PyConDE 2022

Overview

PPML: Machine Learning on Data you cannot see

Repository for the tutorial on Privacy-Preserving Machine Learning (PPML) presented at PyConDE 2022

Abstract

Privacy guarantees are one of the most crucial requirements when it comes to analyse sensitive information. However, data anonymisation techniques alone do not always provide complete privacy protection; moreover Machine Learning (ML) models could also be exploited to leak sensitive data when attacked and no counter-measure is put in place.

Privacy-preserving machine learning (PPML) methods hold the promise to overcome all those issues, allowing to train machine learning models with full privacy guarantees.

This workshop will be mainly organised in two parts. In the first part, we will explore one example of ML model exploitation (i.e. inference attack ) to reconstruct original data from a trained model, and we will then see how differential privacy can help us protecting the privacy of our model, with minimum disruption to the original pipeline. In the second part of the workshop, we will examine a more complicated ML scenario to train Deep learning networks on encrypted data, with specialised distributed federated learning strategies.

Outline

  • Introduction: Brief Intro to PPML and to the workshop (slides)

  • Part 1: Strengthening Deep Neural Networks

    • Model vulnerabilities:
    • Deep Learning with Differential Privacy
  • Part 2: Primer on Privacy-Preserving Machine Learning

Note: the material has been updated after the conference, to match the flow of the presentation as delivered during the conference, as well as to incorporate feedbacks received afterwards.

"PyConDE Logo" Video recording of the session presented at PyCon DE

Get the material

Clone the current repository, in order to get the course materials. To do so, once connected to your remote machine (via SSH), execute the following instructions:

cd $HOME  # This will make sure you'll be in your HOME folder
git clone https://github.com/leriomaggio/ppml-pyconde.git

Note: This will create a new folder named ppml-pyconde. Move into this folder by typing:

cd ppml-pyconde

Well done! Now you should do be in the right location. Bear with me another few seconds, following instructions reported below 🙏

Set up your Environment

To execute the notebooks in this repository, it is necessary to set up the environment.

Please refer to the Get-Ready.ipynb notebook for a step-by-step guide on how to setup the environment, and check that all is working, and ready to go.

Note: You could run this notebook directly in VSCode, or in your existing Jupyter notebook/lab environment:

jupyter notebook Get-Ready.ipynb

Colophon

Author: Valerio Maggio (@leriomaggio), Senior Research Associate, University of Bristol.

All the Code material is distributed under the terms of the Apache License. See LICENSE file for additional details.

All the instructional materials in this repository are free to use, and made available under the [Creative Commons Attribution license][https://creativecommons.org/licenses/by/4.0/]. The following is a human-readable summary of (and not a substitute for) the full legal text of the CC BY 4.0 license.

You are free:

  • to Share---copy and redistribute the material in any medium or format
  • to Adapt---remix, transform, and build upon the material

for any purpose, even commercially.

The licensor cannot revoke these freedoms as long as you follow the license terms.

Under the following terms:

  • Attribution---You must give appropriate credit (mentioning that your work is derived from work that is Copyright © Software Carpentry and, where practical, linking to http://software-carpentry.org/), provide a [link to the license][cc-by-human], and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

No additional restrictions---You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Acknowledgment and funding

The material developed in this tutorial has been supported by the University of Bristol, and by the Software Sustainability Institute (SSI), as part of my SSI fellowship on PETs (Privacy Enchancing Technologies).

Please see this deck to know more about my fellowship plans.

I would also like to thank all the people at OpenMined for all the encouragement and support with the preparation of this tutorial. I hope the material in this repository could contribute to raise awareness about all the amazing work on PETs it's being provided to the Open Source and the Python communities.

SSI Logo UoB Logo OpenMined

Contacts

For any questions or doubts, feel free to open an issue in the repository, or drop me an email @ valerio.maggio_at_gmail_dot_com

You might also like...
Official implementation of GraphMask as presented in our paper Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking.

GraphMask This repository contains an implementation of GraphMask, the interpretability technique for graph neural networks presented in our ICLR 2021

Official implementation of the network presented in the paper
Official implementation of the network presented in the paper "M4Depth: A motion-based approach for monocular depth estimation on video sequences"

M4Depth This is the reference TensorFlow implementation for training and testing depth estimation models using the method described in M4Depth: A moti

Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Population-Based Bandits (PB2) Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimiza

Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021.
Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021.

Playground4AWS Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021. Architecture Minecraft and Lamps This project i

Prototypical python implementation of the trust-region algorithm presented in Sequential Linearization Method for Bound-Constrained Mathematical Programs with Complementarity Constraints by Larson, Leyffer, Kirches, and Manns.

Prototypical python implementation of the trust-region algorithm presented in Sequential Linearization Method for Bound-Constrained Mathematical Programs with Complementarity Constraints by Larson, Leyffer, Kirches, and Manns.

Collection of TensorFlow2 implementations of Generative Adversarial Network varieties presented in research papers.
Collection of TensorFlow2 implementations of Generative Adversarial Network varieties presented in research papers.

TensorFlow2-GAN Collection of tf2.0 implementations of Generative Adversarial Network varieties presented in research papers. Model architectures will

Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences", CVPR 2021.

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature fo

clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation
clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation

README clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation CVPR 2021 Authors: Suprosanna Shit and Johannes C. Paetzo

A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching.

LPM_Python A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching. The code is established ac

Releases(pyconde)
  • pyconde(Jun 14, 2022)

    Tutorial on Privacy-Preserving Machine Learning as presented at PyCon DE 2022 (https://2022.pycon.de/program/QHJ7SX/)

    Full Changelog: https://github.com/leriomaggio/ppml-tutorial/commits/pyconde

    Source code(tar.gz)
    Source code(zip)
Owner
Valerio Maggio
Data Scientist and Researcher @DynamicGenetics
Valerio Maggio
Generating retro pixel game characters with Generative Adversarial Networks. Dataset "TinyHero" included.

pixel_character_generator Generating retro pixel game characters with Generative Adversarial Networks. Dataset "TinyHero" included. Dataset TinyHero D

Agnieszka Mikołajczyk 88 Nov 17, 2022
Open-Set Recognition: A Good Closed-Set Classifier is All You Need

Open-Set Recognition: A Good Closed-Set Classifier is All You Need Code for our paper: "Open-Set Recognition: A Good Closed-Set Classifier is All You

194 Jan 03, 2023
When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings

When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings This is the repository for t

RegLab 39 Jan 07, 2023
HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation Official PyTorch Implementation

: We present a novel, real-time, semantic segmentation network in which the encoder both encodes and generates the parameters (weights) of the decoder. Furthermore, to allow maximal adaptivity, the w

Yuval Nirkin 182 Dec 14, 2022
Autotype on websites that have copy-paste disabled like Moodle, HackerEarth contest etc.

Autotype A quick and small python script that helps you autotype on websites that have copy paste disabled like Moodle, HackerEarth contests etc as it

Tushar 32 Nov 03, 2022
Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)

Depth-supervised NeRF: Fewer Views and Faster Training for Free Project | Paper | YouTube Pytorch implementation of our method for learning neural rad

524 Jan 08, 2023
The trained model and denoising example for paper : Cardiopulmonary Auscultation Enhancement with a Two-Stage Noise Cancellation Approach

The trained model and denoising example for paper : Cardiopulmonary Auscultation Enhancement with a Two-Stage Noise Cancellation Approach

ycj_project 1 Jan 18, 2022
Establishing Strong Baselines for TripClick Health Retrieval; ECIR 2022

TripClick Baselines with Improved Training Data Welcome 🙌 to the hub-repo of our paper: Establishing Strong Baselines for TripClick Health Retrieval

Sebastian Hofstätter 3 Nov 03, 2022
This is the code for "HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields".

HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields This is the code for "HyperNeRF: A Higher-Dimensional

Google 702 Jan 02, 2023
This is the pytorch implementation for the paper: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation, which is accepted to ICCV2021.

GMPQ: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation This is the pytorch implementation for the paper: Generalizable Mix

18 Sep 02, 2022
Code for the bachelors-thesis flaky fault localization

Flaky_Fault_Localization Scripts for the Bachelors-Thesis: "Flaky Fault Localization" by Christian Kasberger. The thesis examines the usefulness of sp

Christian Kasberger 1 Oct 26, 2021
The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.

VAENAR-TTS This repo contains code accompanying the paper "VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis". Sa

THUHCSI 138 Oct 28, 2022
Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting

StochFuzz: A New Solution for Binary-only Fuzzing StochFuzz is a (probabilistically) sound and cost-effective fuzzing technique for stripped binaries.

Zhuo Zhang 164 Dec 05, 2022
Hydra Lightning Template for Structured Configs

Hydra Lightning Template for Structured Configs Template for creating projects with pytorch-lightning and hydra. How to use this template? Create your

Model-driven Machine Learning 4 Jul 19, 2022
Learning Efficient Online 3D Bin Packing on Packing Configuration Trees

Learning Efficient Online 3D Bin Packing on Packing Configuration Trees This repository is being continuously updated, please stay tuned! Any code con

86 Dec 28, 2022
REBEL: Relation Extraction By End-to-end Language generation

REBEL: Relation Extraction By End-to-end Language generation This is the repository for the Findings of EMNLP 2021 paper REBEL: Relation Extraction By

Babelscape 222 Jan 06, 2023
Image Segmentation using U-Net, U-Net with skip connections and M-Net architectures

Brain-Image-Segmentation Segmentation of brain tissues in MRI image has a number of applications in diagnosis, surgical planning, and treatment of bra

Angad Bajwa 8 Oct 27, 2022
i3DMM: Deep Implicit 3D Morphable Model of Human Heads

i3DMM: Deep Implicit 3D Morphable Model of Human Heads CVPR 2021 (Oral) Arxiv | Poject Page This project is the official implementation our work, i3DM

Tarun Yenamandra 60 Jan 03, 2023
FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery (TGRS)

FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery by Ailong Ma, Junjue Wang*, Yanfei Zhon

Kingdrone 43 Jan 05, 2023
Learn other languages ​​using artificial intelligence with python.

The main idea of ​​the project is to facilitate the learning of other languages. We created a simple AI that will interact with you. Just ask questions that if she knows, she will answer.

Pedro Rodrigues 2 Jun 07, 2022