[Preprint] "Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study" by Tianlong Chen*, Kaixiong Zhou*, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang

Overview

License: MIT

Code for [Preprint] Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study

Introduction

This is the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs. We categorize existing approaches, investigate their hyperparameter sensitivity, and unify the basic configuration. We then conduct comprehensive evaluations on dozens of representative graph datasets, including the recent large-scale Open Graph Benchmark (OGB), with diverse deep GNN backbones. Based on synergistic studies, we discover a transferable combination of superior training tricks, which lets us attain new state-of-the-art results for deep GCNs across multiple representative graph datasets.

Requirements

Installation with Conda

conda create -n deep_gcn_benchmark
conda activate deep_gcn_benchmark
pip install -r requirements.txt

Our Installation Notes for PyTorch Geometric.

Environment configurations that worked for us: Mac/Linux + CUDA driver 11.2 + Torch built for CUDA 11.1 + torch_geometric/torch-sparse/etc. built for CUDA 11.1.

Environment configurations that did not work for us: Linux + CUDA 11.1/11.0/10.2 + any version of Torch.

In the working configuration, we used the installation commands below. pip automatically downloaded prebuilt wheels, and the installation completed within seconds.

In the failing configurations, the installation was very slow (on the order of ten minutes for torch-sparse and torch-scatter). It finished without reporting any error, but importing torch_geometric in Python code then raised errors of various types.

Installation commands that worked for us on Linux with CUDA 11.2:

pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.9.0+cu111.html
pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.9.0+cu111.html
pip install torch-geometric
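
Because a failed installation may only surface when the packages are imported, a quick import check after installing can save time. The following is a minimal sketch, not part of the repository: the helper name check_imports is hypothetical, and the module names torch_scatter, torch_sparse and torch_geometric are the import names of the packages installed above.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return {name: status string}."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:
            # Broken binary wheels can raise OSError or similar,
            # not only ImportError, so catch broadly here.
            report[name] = f"FAILED ({type(exc).__name__}: {exc})"
        else:
            version = getattr(module, "__version__", "unknown")
            report[name] = f"ok (version {version})"
    return report


if __name__ == "__main__":
    packages = ["torch", "torch_scatter", "torch_sparse", "torch_geometric"]
    for name, status in check_imports(packages).items():
        print(f"{name:16s} {status}")
```

If any line reports FAILED, the environment is likely in one of the non-working configurations described above.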

Project Structure

.
├── Dataloader.py
├── main.py
├── trainer.py
├── models
│   ├── *.py
├── options
│   ├── base_options.py
│   └── configs
│       ├── *.yml
├── tricks
│   ├── tricks
│   │   ├── dropouts.py
│   │   ├── norms.py
│   │   ├── others.py
│   │   └── skipConnections.py
│   └── tricks_comb.py
└── utils.py

How to Use the Benchmark

Train Deep GCN models as your baselines

To train a deep GCN model <model> on dataset <dataset> as your baseline, run:

python main.py --compare_model=1 --cuda_num=0 --type_model=<model> --dataset=<dataset>
# <model>   in  [APPNP, DAGNN, GAT, GCN, GCNII, GPRGNN, JKNet, SGC]
# <dataset> in  [Cora, Citeseer, Pubmed, ogbn-arxiv, CoauthorCS, CoauthorPhysics, AmazonComputers, AmazonPhoto, TEXAS, WISCONSIN, CORNELL, ACTOR]

We comprehensively explored the optimal hyperparameters for all the models we implemented, and we train each model under these well-studied hyperparameter settings. For model-specific hyperparameter configs, please refer to options/configs/*.yml
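
To run several baselines in one go, the command above can be generated for a grid of models and datasets. This is a hedged sketch rather than a script shipped with the benchmark: the helper names baseline_command and baseline_grid are hypothetical, and only the flags shown in the command above are used.

```python
import itertools
import shlex

# Model names from the list above; a small dataset subset for illustration.
MODELS = ["APPNP", "DAGNN", "GAT", "GCN", "GCNII", "GPRGNN", "JKNet", "SGC"]
DATASETS = ["Cora", "Citeseer", "Pubmed"]


def baseline_command(model, dataset, cuda_num=0):
    """Build the argument list for one baseline run."""
    return [
        "python", "main.py",
        "--compare_model=1",
        f"--cuda_num={cuda_num}",
        f"--type_model={model}",
        f"--dataset={dataset}",
    ]


def baseline_grid(models, datasets, cuda_num=0):
    """Build one command per (model, dataset) pair."""
    return [
        baseline_command(model, dataset, cuda_num)
        for model, dataset in itertools.product(models, datasets)
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in baseline_grid(MODELS, DATASETS):
        print(shlex.join(cmd))
```

Each argument list can be passed to subprocess.run(cmd, check=True) from the repository root to launch the run.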

Explore different trick combinations

To explore different trick combinations, we provide a tricks_comb model, which integrates different types of tricks as follows:

dropouts:        DropEdge, DropNode, FastGCN, LADIES
norms:           BatchNorm, PairNorm, NodeNorm, MeanNorm, GroupNorm, CombNorm
skipConnections: Residual, Initial, Jumping, Dense
others:          IdentityMapping

To train a tricks_comb model with specific tricks, run:

python main.py --compare_model=0 --cuda_num=0 --type_trick=<trick_1>+<trick_2>+...+<trick_n> --dataset=<dataset>

Here, type_trick accepts any number of tricks joined by +. For instance, to train a tricks_comb model with Initial, EdgeDrop, BatchNorm and IdentityMapping on Cora, run:

python main.py --compare_model=0 --cuda_num=0 --type_trick=Initial+EdgeDrop+BatchNorm+IdentityMapping --dataset=Cora

We provide two backbones, --type_model=GCN and --type_model=SGC, for trick combinations. Specifically, when --type_model=SGC and --type_trick=IdentityMapping co-occur, IdentityMapping has higher priority.
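
When exploring many combinations, the --type_trick values can be enumerated programmatically. The sketch below picks one trick from each chosen category and joins them with +. It is an illustration under assumptions: the helper name type_trick_values is hypothetical, and the trick names are copied from the category list above. Note that the list says DropEdge while the example command uses EdgeDrop, so the exact spellings accepted by main.py should be checked in tricks/tricks_comb.py.

```python
import itertools

# Trick names as listed in this README (spellings not verified against the code).
TRICKS = {
    "dropouts": ["DropEdge", "DropNode", "FastGCN", "LADIES"],
    "norms": ["BatchNorm", "PairNorm", "NodeNorm", "MeanNorm", "GroupNorm", "CombNorm"],
    "skipConnections": ["Residual", "Initial", "Jumping", "Dense"],
    "others": ["IdentityMapping"],
}


def type_trick_values(groups, categories):
    """Return every '+'-joined combination with one trick per category."""
    pools = [groups[category] for category in categories]
    return ["+".join(combo) for combo in itertools.product(*pools)]


if __name__ == "__main__":
    # One skip connection combined with one normalization: 4 * 6 combinations.
    for value in type_trick_values(TRICKS, ["skipConnections", "norms"]):
        print(f"--type_trick={value}")
```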

How to Contribute

Contributions of any kind are welcome. Here we provide a brief guide to adding your own deep GCN models and tricks.

Add your own model

Follow these steps to add your own deep GCN model <DeepGCN>.

  1. Create a Python file named <DeepGCN>.py
  2. Implement your model as a torch.nn.Module; we recommend that the class name match the filename <DeepGCN>
  3. Make sure the commonly used hyperparameters are consistent with ours (listed below). To create a new hyperparameter, add it in options/base_options.py.
 --dim_hidden        # hidden dimension
 --num_layers        # number of GCN layers
 --dropout           # rate of dropout for GCN layers
 --lr                # learning rate
 --weight_decay      # rate of l2 regularization
  4. Register your model in models/__init__.py by adding the following code:
from <DeepGCN> import <DeepGCN>
__all__.append('<DeepGCN>')
  5. We recommend using YAML to store your dataset-specific hyperparameter configuration. Create a YAML file <DeepGCN>.yml in options/configs and add the hyperparameters in the following style:
<dataset_1>:
  <hyperparameter_1> : value_1
  <hyperparameter_2> : value_2

Your model <DeepGCN> should now be integrated into our benchmark framework. To test the performance of <DeepGCN> on <dataset>, run:

python main.py --compare_model=1 --type_model=<DeepGCN> --dataset=<dataset>
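
To make the role of the dataset-specific YAML file more concrete, the sketch below shows one way such per-dataset values could override command-line defaults. It is an assumption about the mechanism, not the repository's actual loading code: the function names, the default values, and the example configuration are all hypothetical, and a plain dictionary stands in for the parsed YAML content.

```python
import argparse


def build_parser():
    """Parser with the shared hyperparameters (default values are hypothetical)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", type=str, default="Cora")
    parser.add_argument("--dim_hidden", type=int, default=64)
    parser.add_argument("--num_layers", type=int, default=2)
    parser.add_argument("--dropout", type=float, default=0.5)
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--weight_decay", type=float, default=5e-4)
    return parser


def apply_dataset_config(args, config):
    """Override args with the entries stored under args.dataset, if any."""
    for key, value in config.get(args.dataset, {}).items():
        if not hasattr(args, key):
            raise KeyError(f"unknown hyperparameter in config: {key}")
        setattr(args, key, value)
    return args


# Stand-in for the parsed contents of <DeepGCN>.yml (values are made up).
EXAMPLE_CONFIG = {
    "Cora": {"num_layers": 32, "dropout": 0.6},
    "Citeseer": {"num_layers": 16},
}

if __name__ == "__main__":
    args = build_parser().parse_args(["--dataset", "Cora"])
    args = apply_dataset_config(args, EXAMPLE_CONFIG)
    print(vars(args))
```

Rejecting unknown keys catches a hyperparameter that was added to the YAML file but not registered in options/base_options.py.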

Add your own trick

Because all implemented tricks are tightly coupled in tricks_comb.py, we do not recommend integrating your own trick into tricks_comb, to avoid unexpected errors. However, you can use the interfaces we provide in tricks/tricks/ to combine your own trick with ours.

Main Results and Leaderboard

  • Superior performance of our best combo with 32 layers deep GCNs

Model Ranking on Cora          Test Accuracy
Ours                           85.48
GCNII                          85.29
APPNP                          83.68
DAGNN                          83.39
GPRGNN                         83.13
JKNet                          73.23
SGC                            68.45

Model Ranking on Citeseer      Test Accuracy
Ours                           73.35
GCNII                          73.24
DAGNN                          72.59
APPNP                          72.13
GPRGNN                         71.01
SGC                            61.92
JKNet                          50.68

Model Ranking on PubMed        Test Accuracy
Ours                           80.76
DAGNN                          80.58
APPNP                          80.24
GCNII                          79.91
GPRGNN                         78.46
SGC                            66.61
JKNet                          63.77

Model Ranking on OGBN-ArXiv    Test Accuracy
Ours                           72.70
GCNII                          72.60
DAGNN                          71.46
GPRGNN                         70.18
APPNP                          66.94
JKNet                          66.31
SGC                            34.22

  • Transferability of our best combo with 32 layers deep GCNs

Models     Average Ranking on (CS, Physics, Computers, Photo, Texas, Wisconsin, Cornell, Actor)
Ours       1.500
SGC        6.250
DAGNN      4.375
GCNII      3.875
JKNet      4.875
APPNP      4.000
GPRGNN     3.125

  • Takeaways of the best combo
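
For readers reproducing the transferability table, the average-ranking metric can be illustrated with a minimal sketch. It assumes the metric is the mean of a model's per-dataset rank by test accuracy (an assumption, since this README does not define it). Purely for illustration, it uses the four accuracy tables above instead of the eight transfer datasets, so its output is not expected to match the averages reported in the transferability table.

```python
# Test accuracies copied from the four leaderboard tables above.
ACCURACY = {
    "Cora": {"Ours": 85.48, "GCNII": 85.29, "APPNP": 83.68, "DAGNN": 83.39,
             "GPRGNN": 83.13, "JKNet": 73.23, "SGC": 68.45},
    "Citeseer": {"Ours": 73.35, "GCNII": 73.24, "DAGNN": 72.59, "APPNP": 72.13,
                 "GPRGNN": 71.01, "SGC": 61.92, "JKNet": 50.68},
    "PubMed": {"Ours": 80.76, "DAGNN": 80.58, "APPNP": 80.24, "GCNII": 79.91,
               "GPRGNN": 78.46, "SGC": 66.61, "JKNet": 63.77},
    "OGBN-ArXiv": {"Ours": 72.70, "GCNII": 72.60, "DAGNN": 71.46, "GPRGNN": 70.18,
                   "APPNP": 66.94, "JKNet": 66.31, "SGC": 34.22},
}


def ranks(scores):
    """Rank models by descending score (1 = best); ties are not handled."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {model: position + 1 for position, model in enumerate(ordered)}


def average_rank(tables):
    """Average each model's rank over all datasets in `tables`."""
    collected = {}
    for scores in tables.values():
        for model, rank in ranks(scores).items():
            collected.setdefault(model, []).append(rank)
    return {model: sum(r) / len(r) for model, r in collected.items()}


if __name__ == "__main__":
    for model, value in sorted(average_rank(ACCURACY).items(), key=lambda kv: kv[1]):
        print(f"{model:8s} {value:.3f}")
```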

Citation

If you find this repo helpful, please cite:

TBD

Owner

VITA: Visual Informatics Group @ University of Texas at Austin