Code for the paper "Asymptotics of ℓ2 Regularized Network Embeddings"

Overview

This repository contains the code used for the experiments in the paper "Asymptotics of ℓ2 Regularized Network Embeddings".

Requirements

Requires StellarGraph 1.2.1, TensorFlow 2.6.0, scikit-learn 0.24.1, and tqdm, along with any other packages these four require as dependencies.
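
A typical environment can be set up with pip; the sketch below pins the versions listed above and assumes the standard PyPI package names:

pip install stellargraph==1.2.1 tensorflow==2.6.0 scikit-learn==0.24.1 tqdm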

Code

To run node classification or link prediction experiments, run

python -m code.train_embed [[args]]

or

python -m code.train_embed_link [[args]]

from the command line, respectively, where [[args]] denotes the command-line arguments for each script. Note that the scripts expect to be run from the parent directory of the code folder; you will need to change the import statements in the associated Python files if you move them around. The -h command-line argument displays the arguments (with descriptions) for each of the two files, as in the example below.
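
For example, to list every option for the node classification script:

python -m code.train_embed -h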

train_embed.py arguments

| short | long | default | help |
| --- | --- | --- | --- |
| -h | --help | | Show this help message and exit. |
| | --dataset | Cora | Dataset to perform training on. Available options: Cora, CiteSeer, PubMedDiabetes. |
| | --emb-size | 128 | Embedding dimension. Defaults to 128. |
| | --reg-weight | 0.0 | Weight to use for L2 regularization. If --norm-reg is set, then reg_weight/num_of_nodes is used instead. |
| | --norm-reg | | Boolean flag for whether to normalize the L2 regularization weight by the number of nodes in the graph. Defaults to false. |
| | --method | node2vec | Algorithm to perform training with. Available options: node2vec, GraphSAGE, GCN, DGI. |
| | --verbose | 1 | Level of verbosity. Defaults to 1. |
| | --epochs | 5 | Number of epochs through the dataset to be used for training. |
| | --optimizer | Adam | Optimization algorithm to use for training. |
| | --learning-rate | 0.001 | Learning rate to use for optimization. |
| | --batch-size | 64 | Batch size used for training. |
| | --train-split | [0.01, 0.025, 0.05] | Percentage(s) to use for the training split when using the learned embeddings for downstream classification tasks. |
| | --train-split-num | 25 | Number of random training/test splits to use for evaluating performance. Defaults to 25. |
| | --output-fname | None | If not None, saves the hyperparameters and test results to a .json file with the filename given by the argument. |
| | --node2vec-p | 1.0 | Hyperparameter governing the probability of returning to the source node. |
| | --node2vec-q | 1.0 | Hyperparameter governing the probability of moving to a node away from the source node. |
| | --node2vec-walk-number | 50 | Number of walks used to generate a sample for node2vec. |
| | --node2vec-walk-length | 5 | Walk length to use for node2vec. |
| | --dgi-sampler | fullbatch | Specifies either a fullbatch or a minibatch sampling scheme for DGI. |
| | --gcn-activation | ['relu'] | Determines the activation of each layer within a GCN. Defaults to a single layer with relu activation. |
| | --graphSAGE-aggregator | mean | Specifies the aggregation rule used in GraphSAGE. Defaults to mean pooling. |
| | --graphSAGE-nbhd-sizes | [10, 5] | Neighbourhood sizes for sampling in GraphSAGE. Defaults to [10, 5]. |
| | --tensorboard | | If toggled, saves TensorBoard logs for debugging purposes. |
| | --visualize-embeds | None | If given a directory, saves an image of a t-SNE 2D projection of the learned embeddings in that directory. |
| | --save-spectrum | None | If specified, saves the spectrum of the learned embeddings output by the algorithm. |
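
As an illustration, a node classification run using node2vec on Cora with a normalized L2 regularization weight might look like the following (the flag values here are illustrative, not recommended settings):

python -m code.train_embed --dataset Cora --method node2vec --reg-weight 0.5 --norm-reg --emb-size 128 --epochs 5 --output-fname cora_node2vec.json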

train_embed_link.py arguments

| short | long | default | help |
| --- | --- | --- | --- |
| -h | --help | | Show this help message and exit. |
| | --dataset | Cora | Dataset to perform training on. Available options: Cora, CiteSeer, PubMedDiabetes. |
| | --emb-size | 128 | Embedding dimension. Defaults to 128. |
| | --reg-weight | 0.0 | Weight to use for L2 regularization. If --norm-reg is set, then reg_weight/num_of_nodes is used instead. |
| | --norm-reg | | Boolean flag for whether to normalize the L2 regularization weight by the number of nodes in the graph. Defaults to false. |
| | --method | node2vec | Algorithm to perform training with. Available options: node2vec, GraphSAGE, GCN, DGI. |
| | --verbose | 1 | Level of verbosity. Defaults to 1. |
| | --epochs | 5 | Number of epochs through the dataset to be used for training. |
| | --optimizer | Adam | Optimization algorithm to use for training. |
| | --learning-rate | 0.001 | Learning rate to use for optimization. |
| | --batch-size | 64 | Batch size used for training. |
| | --test-split | 0.1 | Fraction of the edge/non-edge set to be used for testing. |
| | --output-fname | None | If not None, saves the hyperparameters and test results to a .json file with the filename given by the argument. |
| | --node2vec-p | 1.0 | Hyperparameter governing the probability of returning to the source node. |
| | --node2vec-q | 1.0 | Hyperparameter governing the probability of moving to a node away from the source node. |
| | --node2vec-walk-number | 50 | Number of walks used to generate a sample for node2vec. |
| | --node2vec-walk-length | 5 | Walk length to use for node2vec. |
| | --gcn-activation | ['relu'] | Specifies layers in terms of their output activation (either relu or linear), with the number of arguments determining the depth of the GCN. Defaults to a single layer with relu activation. |
| | --graphSAGE-aggregator | mean | Specifies the aggregation rule used in GraphSAGE. Defaults to mean pooling. |
| | --graphSAGE-nbhd-sizes | [10, 5] | Neighbourhood sizes for sampling in GraphSAGE. Defaults to [10, 5]. |
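
Similarly, a link prediction run with a two-layer GCN might look like the following (flag values are again illustrative; passing two space-separated activations to --gcn-activation assumes the script reads list-valued flags in argparse's nargs style):

python -m code.train_embed_link --dataset CiteSeer --method GCN --gcn-activation relu linear --test-split 0.1 --output-fname citeseer_gcn_link.json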