Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

Last update: Nov 30, 2022

Overview

federated is the source code for the Bachelor's Thesis

Privacy-Preserving Federated Learning Applied to Decentralized Data (Spring 2021, NTNU)

Federated learning (also known as collaborative learning) is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging them. In this project, the decentralized data is the MIT-BIH Arrhythmia Database.

Features
Initial Setup
Installation
Running experiments with federated
Analyzing experiments with federated
- TensorBoard
- Jupyter Notebook
Documentation
Tests
How to Contribute
Owners

Features

ML pipelines using centralized learning or federated learning.
Support for the following aggregation methods:
- Federated Stochastic Gradient Descent (FedSGD)
- Federated Averaging (FedAvg)
- Differentially-Private Federated Averaging (DP-FedAvg)
- Federated Averaging with Homomorphic Encryption
- Robust Federated Aggregation (RFA)
Support for the following models:
- A simple softmax regressor
- A feed-forward neural network (ANN)
- A convolutional neural network (CNN)
Model compression in federated learning.

Installation

Prerequisites

Python 3.8
make
Docker 20.10 (optional)

Initial Setup

1. Cloning federated

$ git clone https://github.com/dilawarm/federated.git
$ cd federated

2. Getting the Dataset

To download the MIT-BIH Arrhythmia Database dataset used in this project, go to https://www.kaggle.com/shayanfazeli/heartbeat and download the files

mitbih_train.csv
mitbih_test.csv

Then write:

mkdir data
mkdir data/mitbih

and move the downloaded data into the data/mitbih folder.

Installing federated locally

1. Install the Python development environment

On Ubuntu:

$ sudo apt update
$ sudo apt install python3-dev python3-pip  # Python 3.8
$ sudo apt install build-essential          # make
$ sudo pip3 install --user --upgrade virtualenv

On macOS:

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
$ export PATH="/usr/local/bin:/usr/local/sbin:$PATH"
$ brew update
$ brew install python  # Python 3.8
$ brew install make    # make
$ sudo pip3 install --user --upgrade virtualenv

2. Create a virtual environment

$ virtualenv --python python3 "venv"
$ source "venv/bin/activate"
(venv) $ pip install --upgrade pip

3. Install the dependencies

(venv) $ make install

4. Test TensorFlow Federated

(venv) $ python -c "import tensorflow_federated as tff; print(tff.federated_computation(lambda: 'Hello World')())"

Installing with Docker (optional)

Build and run image from Dockerfile

$ make docker

Running experiments with federated

federated has a client program, where one can initialize the different pipelines and train models with centralized or federated learning. To run this client program:

(venv) $ make help

This will display a list of options:

usage: python -m federated.main [-h] -l  -n  [-e] [-op] [-b] [-o] -m  [-lr]

Experimentation pipeline for federated 🚀

optional arguments:
  -b , --batch_size     The batch size. (default: 32)
  -e , --epochs         Number of global epochs. (default: 15)
  -h, --help            show this help message and exit
  -l , --learning_approach 
                        Learning apporach (centralized, federated). (default: None)
  -lr , --learning_rate 
                        Learning rate for server optimizer. (default: 1.0)
  -m , --model          The model to be trained with the learning approach (ann, softmax_regression, cnn). (default: None)
  -n , --experiment_name 
                        The name of the experiment. (default: None)
  -o , --output         Path to the output folder where the experiment is going to be saved. (default: history)
  -op , --optimizer     Server optimizer (adam, sgd). (default: sgd)

Here is an example on how to train a cnn model with federated learning for 10 global epochs using the SGD server-optimizer with a learning rate of 0.01:

(venv) $ python -m federated.main --learning_approach federated --model cnn --epochs 10 --optimizer sgd --learning_rate 0.01 --experiment_name experiment_name --output path/to/experiments

Running the command illustrated above, will display a list of input fields where one can fill in more information about the training configuration, such as aggregation method, if differential privacy should be used etc. Once all training configurations have been decided, the pipeline will be initialized. All logs and training configurations will be stored in the folder path/to/experiments/logdir/experiment_name.

Analyzing experiments with federated

TensorBoard

To analyze the results with TensorBoard:

(venv) $ tensorboard --logdir=path/to/experiments/logdir/experiment_name --port=6060

Jupyter Notebook

To analyze the results in the ModelAnalysis notebook, open the notebook with your editor. For example:

(venv) $ code notebooks/ModelAnalysis.ipynb

Replace the first line in this notebook with the absolute path to your experiment folder, and run the notebook to see the results.

Documentation

The documentation can be found here.

To generate the documentation locally:

(venv) $ cd docs
(venv) $ make html
(venv) $ firefox _build/html/index.html

Tests

The unit tests included in federated are:

Tests for data preprocessing
Tests for different machine learning models
Tests for the training loops
Tests for the different privacy algorithms such as RFA.

To run all the tests:

(venv) $ make tests

To generate coverage after running the tests:

(venv) $ coverage html
(venv) $ firefox htmlcov/index.html

See the Makefile for more commands to test the modules in federated separately.

How to Contribute

Clone repo and create a new branch:

$ git checkout https://github.com/dilawarm/federated.git -b name_for_new_branch

Make changes and test.
Submit Pull Request with comprehensive description of changes.

Owners

Pernille Kopperud	Dilawar Mahmood

Enjoy! 🙂

Comments

Replace Makefile with .sh

It's not necessary to install make to run the commands. The project should use a .sh file instead so that users do not have to install make (one less dependency).
enhancement

opened by dilawarm 0

Releases(v1.0)

v1.0(May 19, 2021)

See the ReadMe for the features.
Source code(tar.gz)
Source code(zip)

Politecnico of Turin Thesis: "Implementation and Evaluation of an Educational Chatbot based on NLP Techniques"

THESIS_CAIRONE_FIORENTINO Politecnico of Turin Thesis: "Implementation and Evaluation of an Educational Chatbot based on NLP Techniques" GENERATE TOKE

1 Dec 10, 2021

nnDetection is a self-configuring framework for 3D (volumetric) medical object detection which can be applied to new data sets without manual intervention. It includes guides for 12 data sets that were used to develop and evaluate the performance of the proposed method.

What is nnDetection? Simultaneous localisation and categorization of objects in medical images, also referred to as medical object detection, is of hi

365 Jan 9, 2023

We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.

Multi-Modal Self-Supervision using GDT and StiCa This is an official pytorch implementation of papers: Multi-modal Self-Supervision from Generalized D

42 Dec 9, 2022

Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

Related tags

Overview

Table of Contents

Features

Installation

Prerequisites

Initial Setup

Installing federated locally

Installing with Docker (optional)

Running experiments with federated

Analyzing experiments with federated

TensorBoard

Jupyter Notebook

Documentation

Tests

How to Contribute

Owners

You might also like...

Politecnico of Turin Thesis: "Implementation and Evaluation of an Educational Chatbot based on NLP Techniques"

nnDetection is a self-configuring framework for 3D (volumetric) medical object detection which can be applied to new data sets without manual intervention. It includes guides for 12 data sets that were used to develop and evaluate the performance of the proposed method.

We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.

Deep Learning applied to Integral data analysis

Aalto-cs-msc-theses - Listing of M.Sc. Theses of the Department of Computer Science at Aalto University

Udacity's CS101: Intro to Computer Science - Building a Search Engine

The repository forked from NVlabs uses our data. (Differentiable rasterization applied to 3D model simplification tasks)

Decentralized Reinforcment Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

Code to go with the paper "Decentralized Bayesian Learning with Metropolis-Adjusted Hamiltonian Monte Carlo"

Comments

Replace Makefile with .sh

Releases(v1.0)

v1.0(May 19, 2021)

Owner

Dilawar Mahmood

MediaPipe Kullanarak İleri Seviye Bilgisayarla Görü

Provide partial dates and retain the date precision through processing

[NeurIPS2021] Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks

A Transformer-Based Siamese Network for Change Detection

KSAI Lite is a deep learning inference framework of kingsoft, based on tensorflow lite

Learn about quantum computing and algorithm on quantum computing

code for paper"A High-precision Semantic Segmentation Method Combining Adversarial Learning and Attention Mechanism"

🔥 Cogitare - A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python

Code for NeurIPS 2021 paper 'Spatio-Temporal Variational Gaussian Processes'

Code repository for the paper "Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation" with instructions to reproduce the results.

Face Depixelizer based on "PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models" repository.

Semi-Supervised Learning with Ladder Networks in Keras. Get 98% test accuracy on MNIST with just 100 labeled examples !

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

pyhsmm - library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.

Optical machine for senses sensing using speckle and deep learning

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

Differentiable architecture search for convolutional and recurrent networks

I will implement Fastai in each projects present in this repository.

Grow Function: Generate 3D Stacked Bifurcating Double Deep Cellular Automata based organisms which differentiate using a Genetic Algorithm...

graph-theoretic framework for robust pairwise data association