PyGCL: A PyTorch Library for Graph Contrastive Learning

Overview

PyGCL is a PyTorch-based open-source Graph Contrastive Learning (GCL) library, which features modularized GCL components from published papers, standardized evaluation, and experiment management.

What is Graph Contrastive Learning?

Graph Contrastive Learning (GCL) establishes a new paradigm for learning graph representations without human annotations. A typical GCL algorithm first constructs multiple graph views via stochastic augmentation of the input and then learns representations by contrasting positive samples against negative ones.

👉 For a general introduction to GCL, please refer to our paper and blog. Also, this repo tracks newly published GCL papers.

Install

Prerequisites

PyGCL needs the following packages to be installed beforehand:

  • Python 3.8+
  • PyTorch 1.9+
  • PyTorch-Geometric 1.7
  • DGL 0.7+
  • Scikit-learn 0.24+
  • Numpy
  • tqdm
  • NetworkX

Installation via PyPI

To install PyGCL with pip, simply run:

pip install PyGCL

Then, you can import GCL from your current environment.

A note regarding DGL

Currently the DGL team maintains two versions, dgl for CPU support and dgl-cu*** for CUDA support. Since pip treats them as different packages, it is hard for PyGCL to check for the version requirement of dgl. We have removed such dependency checks for dgl in our setup configuration and require the users to install a proper version by themselves.
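Since the check is left to the user, a small script can list which DGL distributions are present in the current environment. This is a minimal sketch using only the standard library. It assumes the CUDA builds are distributed under names starting with dgl-cu, as described above; the helper name installed_dgl_distributions is hypothetical and not part of PyGCL.

```python
from importlib import metadata


def installed_dgl_distributions():
    """Return sorted (name, version) pairs for distributions named dgl or dgl-cu*."""
    found = []
    for dist in metadata.distributions():
        # Use .get so that a distribution with broken metadata is skipped quietly.
        name = (dist.metadata.get("Name") or "").lower()
        if name == "dgl" or name.startswith("dgl-cu"):
            found.append((name, dist.metadata.get("Version") or "unknown"))
    return sorted(found)


if __name__ == "__main__":
    matches = installed_dgl_distributions()
    if matches:
        for name, version in matches:
            print(f"{name} {version}")
    else:
        print("No DGL distribution found; install dgl or a dgl-cu* build manually.")
```

The version printed here still has to be compared by hand against the DGL 0.7+ requirement listed in the prerequisites.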

Package Overview

Our PyGCL implements four main components of graph contrastive learning algorithms:

  • Graph augmentation: transforms input graphs into congruent graph views.
  • Contrasting architectures and modes: generate positive and negative pairs according to node and graph embeddings.
  • Contrastive objectives: compute the likelihood score for positive and negative pairs.
  • Negative mining strategies: improve the negative sample set by considering the relative similarity (the hardness) of negative samples.

We also implement utilities for training models, evaluating model performance, and managing experiments.

Implementations and Examples

For a quick start, please check out the examples folder. We have currently implemented the following methods:

  • DGI (P. Veličković et al., Deep Graph Infomax, ICLR, 2019) [Example1, Example2]
  • InfoGraph (F.-Y. Sun et al., InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization, ICLR, 2020) [Example]
  • MVGRL (K. Hassani et al., Contrastive Multi-View Representation Learning on Graphs, ICML, 2020) [Example1, Example2]
  • GRACE (Y. Zhu et al., Deep Graph Contrastive Representation Learning, GRL+@ICML, 2020) [Example]
  • GraphCL (Y. You et al., Graph Contrastive Learning with Augmentations, NeurIPS, 2020) [Example]
  • SupCon (P. Khosla et al., Supervised Contrastive Learning, NeurIPS, 2020) [Example]
  • HardMixing (Y. Kalantidis et al., Hard Negative Mixing for Contrastive Learning, NeurIPS, 2020)
  • DCL (C.-Y. Chuang et al., Debiased Contrastive Learning, NeurIPS, 2020)
  • HCL (J. Robinson et al., Contrastive Learning with Hard Negative Samples, ICLR, 2021)
  • Ring (M. Wu et al., Conditional Negative Sampling for Contrastive Learning of Visual Representations, ICLR, 2021)
  • Exemplar (N. Zhao et al., What Makes Instance Discrimination Good for Transfer Learning?, ICLR, 2021)
  • BGRL (S. Thakoor et al., Bootstrapped Representation Learning on Graphs, arXiv, 2021) [Example1, Example2]
  • G-BT (P. Bielak et al., Graph Barlow Twins: A Self-Supervised Representation Learning Framework for Graphs, arXiv, 2021) [Example]
  • VICReg (A. Bardes et al., VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning, arXiv, 2021)

Building Your Own GCL Algorithms

Besides trying the above examples for node and graph classification tasks, you can also build your own graph contrastive learning algorithms straightforwardly.

Graph Augmentation

In GCL.augmentors, PyGCL provides the Augmentor base class, which offers a universal interface for graph augmentation functions. Specifically, PyGCL implements the following augmentation functions:

| Augmentation | Class name |
|---|---|
| Edge Adding (EA) | EdgeAdding |
| Edge Removing (ER) | EdgeRemoving |
| Feature Masking (FM) | FeatureMasking |
| Feature Dropout (FD) | FeatureDropout |
| Edge Attribute Masking (EAR) | EdgeAttrMasking |
| Personalized PageRank (PPR) | PPRDiffusion |
| Markov Diffusion Kernel (MDK) | MarkovDiffusion |
| Node Dropping (ND) | NodeDropping |
| Node Shuffling (NS) | NodeShuffling |
| Subgraphs induced by Random Walks (RWS) | RWSampling |
| Ego-net Sampling (ES) | Identity |

Calling one of these augmentation functions on a graph, given as a tuple of node features, edge index, and edge features (x, edge_index, edge_attrs), produces the corresponding augmented graph.
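To make the calling convention concrete, here is a minimal sketch of an edge-removing augmentation written in plain Python. It is not PyGCL's EdgeRemoving implementation: the function name remove_edges, the seed parameter, and the use of nested lists instead of tensors are assumptions made for illustration. It only shows the tuple-in, tuple-out shape of the interface.

```python
import random


def remove_edges(x, edge_index, edge_attrs=None, pe=0.3, seed=None):
    """Drop each edge independently with probability pe.

    x: list of per-node feature lists (returned unchanged).
    edge_index: pair of lists [sources, targets] of equal length.
    edge_attrs: optional list with one entry per edge.
    """
    rng = random.Random(seed)
    sources, targets = edge_index
    # rng.random() lies in [0, 1), so pe=0 keeps every edge and pe=1 drops all.
    keep = [i for i in range(len(sources)) if rng.random() >= pe]
    new_index = [[sources[i] for i in keep], [targets[i] for i in keep]]
    new_attrs = None if edge_attrs is None else [edge_attrs[i] for i in keep]
    return x, new_index, new_attrs


# A triangle graph with two-dimensional node features.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edge_index = [[0, 1, 2], [1, 2, 0]]
x_aug, edge_index_aug, edge_attrs_aug = remove_edges(x, edge_index, pe=0.3, seed=0)
```

Node features pass through untouched, and edge attributes (when given) are filtered with the same index set as the edges so the two stay aligned.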

Composite Augmentations

PyGCL supports composing arbitrary numbers of augmentations together. To compose a list of augmentation instances augmentors, you need to use the Compose class:

import GCL.augmentors as A

aug = A.Compose([A.EdgeRemoving(pe=0.3), A.FeatureMasking(pf=0.3)])

You can also use the RandomChoice class to randomly draw a few augmentations each time:

import GCL.augmentors as A

aug = A.RandomChoice([A.RWSampling(num_seeds=1000, walk_length=10),
                      A.NodeDropping(pn=0.1),
                      A.FeatureMasking(pf=0.1),
                      A.EdgeRemoving(pe=0.1)],
                     num_choices=1)
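Conceptually, composing augmentations means feeding the output graph of one augmentation into the next, and a random choice draws a subset before chaining. The following plain-Python sketch illustrates that idea under stated assumptions: compose, random_choice, and the two toy augmentors are hypothetical helpers, and drawing distinct augmentors without replacement is an assumption about how such a class might behave rather than a statement about PyGCL internals.

```python
import random


def compose(augmentors):
    """Chain augmentors: the output graph of one is the input of the next."""
    def apply(x, edge_index, edge_attrs=None):
        graph = (x, edge_index, edge_attrs)
        for aug in augmentors:
            graph = aug(*graph)
        return graph
    return apply


def random_choice(augmentors, num_choices, seed=None):
    """Draw num_choices distinct augmentors on every call and chain them."""
    rng = random.Random(seed)

    def apply(x, edge_index, edge_attrs=None):
        chosen = rng.sample(augmentors, num_choices)
        return compose(chosen)(x, edge_index, edge_attrs)
    return apply


# Two toy augmentors used only for illustration.
def zero_first_feature(x, edge_index, edge_attrs=None):
    return [[0.0] + row[1:] for row in x], edge_index, edge_attrs


def drop_last_edge(x, edge_index, edge_attrs=None):
    sources, targets = edge_index
    attrs = None if edge_attrs is None else edge_attrs[:-1]
    return x, [sources[:-1], targets[:-1]], attrs


both = compose([zero_first_feature, drop_last_edge])
one_of = random_choice([zero_first_feature, drop_last_edge], num_choices=1, seed=0)
```

Because every augmentor takes and returns the same (x, edge_index, edge_attrs) tuple, composed and randomly chosen augmentations can be used anywhere a single augmentation is expected.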

Customizing Your Own Augmentation

You can write your own augmentation functions by inheriting the base Augmentor class and defining the augment function.
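The pattern can be sketched as follows. To keep the example runnable without PyGCL installed, it defines stand-in Graph and Augmentor classes that mirror the interface described above; the real classes in GCL.augmentors may differ in field names and details. FeatureNoise is a hypothetical augmentation invented for this illustration.

```python
import random
from abc import ABC, abstractmethod
from typing import NamedTuple, Optional


class Graph(NamedTuple):
    """Stand-in for the (x, edge_index, edge_attrs) graph tuple (hypothetical)."""
    x: list
    edge_index: list
    edge_attrs: Optional[list] = None


class Augmentor(ABC):
    """Stand-in base class; PyGCL's real Augmentor may differ in detail."""

    @abstractmethod
    def augment(self, g: Graph) -> Graph:
        ...

    def __call__(self, x, edge_index, edge_attrs=None):
        # Wrap the inputs, delegate to augment, and unwrap to a plain tuple.
        return tuple(self.augment(Graph(x, edge_index, edge_attrs)))


class FeatureNoise(Augmentor):
    """Hypothetical custom augmentation: add uniform noise to node features."""

    def __init__(self, scale: float, seed: Optional[int] = None):
        self.scale = scale
        self.rng = random.Random(seed)

    def augment(self, g: Graph) -> Graph:
        noisy = [[v + self.rng.uniform(-self.scale, self.scale) for v in row]
                 for row in g.x]
        # Only features change; structure and edge attributes are passed through.
        return Graph(noisy, g.edge_index, g.edge_attrs)


aug = FeatureNoise(scale=0.1, seed=0)
x_aug, edge_index_aug, edge_attrs_aug = aug([[1.0, 2.0], [3.0, 4.0]], [[0], [1]])
```

With the real library, the subclass would inherit from the Augmentor class in GCL.augmentors instead of the stand-in, and only the augment method would need to be written.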

Contrasting Architectures and Modes

Existing GCL architectures can be grouped into two lines: negative-sample-based methods and negative-sample-free ones.

  • Negative-sample-based approaches can either have one single branch or two branches. In single-branch contrasting, we only need to construct one graph view and perform contrastive learning within this view. In dual-branch models, we generate two graph views and perform contrastive learning within and across views.
  • Negative-sample-free approaches eschew the need for explicit negative samples. Currently, PyGCL supports bootstrap-style contrastive learning as well as contrastive learning within embeddings (such as Barlow Twins and VICReg).

| Contrastive architectures | Supported contrastive modes | Need negative samples | Class name | Examples |
|---|---|---|---|---|
| Single-branch contrasting | G2L only | Yes | SingleBranchContrast | DGI, InfoGraph |
| Dual-branch contrasting | L2L, G2G, and G2L | Yes | DualBranchContrast | GRACE |
| Bootstrapped contrasting | L2L, G2G, and G2L | No | BootstrapContrast | BGRL |
| Within-embedding contrasting | L2L and G2G | No | WithinEmbedContrast | G-BT |

Moreover, you can use add_extra_mask if you want to add positives or remove negatives. This function applies a bitwise OR between the positive mask and the extra positive mask specified by extra_pos_mask, and a bitwise AND between the negative mask and the extra negative mask specified by extra_neg_mask. It is helpful, for example, when you have supervision signals from labels and want to train the model in a semi-supervised manner.
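For the semi-supervised case mentioned above, the extra positive mask has to come from somewhere. The following is a minimal sketch of one way to derive it from partial labels, written with nested Python lists for clarity. The helper label_positive_mask is hypothetical; in practice the result would be converted to a tensor of the shape and dtype that add_extra_mask expects, which is not shown here.

```python
def label_positive_mask(labels, labeled):
    """Mark pairs (i, j) as extra positives when both are labeled with the same class.

    labels[i] is the class of sample i; labeled[i] says whether that label
    may be used for training. Pairs involving an unlabeled sample stay False.
    """
    n = len(labels)
    return [[bool(labeled[i] and labeled[j] and labels[i] == labels[j])
             for j in range(n)]
            for i in range(n)]


# Four samples, the last one has no usable label.
extra_pos_mask = label_positive_mask(labels=[0, 0, 1, 1],
                                     labeled=[True, True, True, False])
```

Here samples 0 and 1 become mutual positives, while sample 3 contributes no extra positives because its label is treated as unknown.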

Internally, PyGCL calls Sampler classes in GCL.models that receive embeddings and produce positive/negative masks. PyGCL implements three contrasting modes: (a) Local-Local (L2L), (b) Global-Global (G2G), and (c) Global-Local (G2L). The L2L and G2G modes contrast embeddings at the same scale, while the G2L mode performs cross-scale contrasting. To implement your own GCL model, you may also use these provided sampler models:

| Contrastive modes | Class name |
|---|---|
| Same-scale contrasting (L2L and G2G) | SameScaleSampler |
| Cross-scale contrasting (G2L) | CrossScaleSampler |

  • For L2L and G2G, embedding pairs of the same node/graph in different views constitute positive pairs. You can refer to GRACE and GraphCL for examples.
  • For G2L, node-graph embedding pairs form positives. Note that for single-graph datasets, the G2L mode requires explicit negative sampling (otherwise no negatives for contrasting). You can refer to DGI for an example.
  • Some models (e.g., GRACE) add extra intra-view negative samples. You may manually call sampler.add_intraview_negs to enlarge the negative sample set.
  • Note that the bootstrapping latent model involves some special model design (asymmetric online/offline encoders and momentum weight updates). You may refer to BGRL for details.
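The positive and negative masks described in the list above can be pictured with a small plain-Python sketch. It is an illustration of the idea, not the code of SameScaleSampler or CrossScaleSampler: the function names and the batch argument (a list giving the graph that each node belongs to) are assumptions.

```python
def same_scale_masks(num_samples):
    """L2L/G2G: sample i in one view is positive only with sample i in the other."""
    pos = [[1 if i == j else 0 for j in range(num_samples)]
           for i in range(num_samples)]
    neg = [[1 - v for v in row] for row in pos]
    return pos, neg


def cross_scale_masks(batch, num_graphs):
    """G2L: graph g is positive with node n exactly when batch[n] == g."""
    pos = [[1 if g == owner else 0 for owner in batch]
           for g in range(num_graphs)]
    neg = [[1 - v for v in row] for row in pos]
    return pos, neg


# Three nodes: the first two belong to graph 0, the last one to graph 1.
pos, neg = cross_scale_masks(batch=[0, 0, 1], num_graphs=2)
```

With a single graph, cross_scale_masks([0, 0, 0], 1) yields a negative mask that is all zeros, which is why the G2L mode needs explicit negative sampling on single-graph datasets.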

Contrastive Objectives

In GCL.losses, PyGCL implements the following contrastive objectives:

| Contrastive objectives | Class name |
|---|---|
| InfoNCE loss | InfoNCE |
| Jensen-Shannon Divergence (JSD) loss | JSD |
| Triplet Margin (TM) loss | Triplet |
| Bootstrapping Latent (BL) loss | BootstrapLatent |
| Barlow Twins (BT) loss | BarlowTwins |
| VICReg loss | VICReg |

All these objectives can contrast arbitrary positive and negative pairs, except for the Barlow Twins and VICReg losses, which perform contrastive learning within embeddings. Moreover, for the InfoNCE and Triplet losses, we further provide SP variants that compute contrastive objectives given only one positive pair per sample, to speed up computation and avoid excessive memory consumption.
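As an illustration of how an objective consumes positive and negative masks, here is a minimal plain-Python sketch of an InfoNCE-style loss. It follows the common formulation (softmax over each anchor's positives and negatives, averaged over its positives) and is not taken from GCL.losses. The function name info_nce, the precomputed similarity matrix, and the temperature default are assumptions.

```python
import math


def info_nce(sim, pos_mask, neg_mask, tau=0.5):
    """Mean InfoNCE-style loss over anchors.

    sim[i][j] is the similarity between anchor i and sample j;
    pos_mask / neg_mask mark which samples are positives / negatives.
    Every anchor is assumed to have at least one positive.
    """
    losses = []
    for s_row, p_row, n_row in zip(sim, pos_mask, neg_mask):
        exp = [math.exp(s / tau) for s in s_row]
        # The denominator sums over this anchor's positives and negatives.
        denom = sum(e * (p + n) for e, p, n in zip(exp, p_row, n_row))
        # Average log-probability over the anchor's positives.
        log_prob = sum(p * (s / tau - math.log(denom))
                       for s, p in zip(s_row, p_row)) / sum(p_row)
        losses.append(-log_prob)
    return sum(losses) / len(losses)


identity = [[1, 0], [0, 1]]
off_diagonal = [[0, 1], [1, 0]]
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], identity, off_diagonal, tau=1.0)
misaligned = info_nce([[0.0, 1.0], [1.0, 0.0]], identity, off_diagonal, tau=1.0)
```

In this toy case the aligned similarities give a lower loss than the misaligned ones, which is the behavior the objective is meant to reward.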

Negative Sampling Strategies

PyGCL further implements several negative sampling strategies:

| Negative sampling strategies | Class name |
|---|---|
| Subsampling | GCL.models.SubSampler |
| Hard negative mixing | GCL.models.HardMixing |
| Conditional negative sampling | GCL.models.Ring |
| Debiased contrastive objective | GCL.losses.DebiasedInfoNCE, GCL.losses.DebiasedJSD |
| Hardness-biased negative sampling | GCL.losses.HardnessInfoNCE, GCL.losses.HardnessJSD |

The first three models serve as an additional sampling step, similar to the existing Sampler classes, and can be used in conjunction with any objective. The last two objectives are available only for the InfoNCE and JSD losses.
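To give a feel for what an additional sampling step can look like, the sketch below thins out a negative mask so that each anchor keeps at most k negatives. This is a generic illustration of subsampling on masks; it is not the implementation of GCL.models.SubSampler, and the function name, the k parameter, and the seed parameter are hypothetical.

```python
import random


def subsample_negatives(neg_mask, k, seed=None):
    """Keep at most k randomly chosen negatives per anchor (one anchor per row)."""
    rng = random.Random(seed)
    out = []
    for row in neg_mask:
        candidates = [j for j, v in enumerate(row) if v]
        kept = set(rng.sample(candidates, min(k, len(candidates))))
        out.append([1 if j in kept else 0 for j in range(len(row))])
    return out


neg_mask = [[0, 1, 1, 1],
            [1, 0, 1, 1]]
smaller = subsample_negatives(neg_mask, k=2, seed=0)
```

Because the output is again a mask of the same shape, a step like this can sit between a sampler and any mask-based objective.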

Utilities

PyGCL provides a variety of evaluator functions to evaluate the embedding quality:

| Evaluator | Class name |
|---|---|
| Logistic regression | LREvaluator |
| Support vector machine | SVMEvaluator |
| Random forest | RFEvaluator |

To use these evaluators, you first need to generate dataset splits by get_split (random split) or by from_predefined_split (according to preset splits).
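A random split of this kind can be sketched in a few lines of plain Python. The function below is a hypothetical stand-in rather than PyGCL's get_split: the default ratios, the seed parameter, and the dictionary keys are assumptions chosen for illustration.

```python
import random


def random_split(num_samples, train_ratio=0.1, test_ratio=0.8, seed=None):
    """Shuffle sample indices and cut them into train / test / valid parts."""
    if train_ratio + test_ratio >= 1:
        raise ValueError("train_ratio + test_ratio must leave room for validation")
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    train_size = int(num_samples * train_ratio)
    test_size = int(num_samples * test_ratio)
    return {
        "train": indices[:train_size],
        "test": indices[train_size:train_size + test_size],
        # Whatever remains after train and test is used for validation.
        "valid": indices[train_size + test_size:],
    }


split = random_split(100, seed=0)
```

The three index lists are disjoint and together cover every sample, so embeddings and labels can be indexed with them directly before fitting an evaluator.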

Contribution

Feel free to open an issue should you find anything unexpected or create pull requests to add your own work! We are motivated to continuously make PyGCL even better.

Citation

Please cite our paper if you use this code in your own work:

@article{Zhu:2021tu,
author = {Zhu, Yanqiao and Xu, Yichen and Liu, Qiang and Wu, Shu},
title = {{An Empirical Study of Graph Contrastive Learning}},
journal = {arXiv.org},
year = {2021},
eprint = {2109.01116v1},
eprinttype = {arxiv},
eprintclass = {cs.LG},
month = sep,
}