An implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019).

Overview

MixHop and N-GCN

A PyTorch implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019) and "A Higher-Order Graph Convolutional Layer" (NeurIPS 2018).


Abstract

Recent methods generalize convolutional layers from Euclidean domains to graph-structured data by approximating the eigenbasis of the graph Laplacian. The computationally-efficient and broadly-used Graph ConvNet of Kipf & Welling, over-simplifies the approximation, effectively rendering graph convolution as a neighborhood-averaging operator. This simplification restricts the model from learning delta operators, the very premise of the graph Laplacian. In this work, we propose a new Graph Convolutional layer which mixes multiple powers of the adjacency matrix, allowing it to learn delta operators. Our layer exhibits the same memory footprint and computational complexity as a GCN. We illustrate the strength of our proposed layer on both synthetic graph datasets, and on several real-world citation graphs, setting the record state-of-the-art on Pubmed.
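As a rough illustration of the layer the abstract describes, the NumPy sketch below concatenates the outputs of several adjacency powers, each with its own weight matrix. The function names, the dense matrices, the GCN-style symmetric normalization with self-loops, and the ReLU activation are all assumptions made for this example; it is not the code in this repository.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2 (assumed, GCN-style)."""
    adj_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(adj_hat.sum(axis=1))
    return adj_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def mixhop_layer(adj_norm, features, weights):
    """Concatenate relu(A^j H W_j) column-wise for j = 0 .. len(weights) - 1."""
    outputs = []
    propagated = features  # A^0 H
    for power, weight in enumerate(weights):
        if power > 0:
            # Reuse A^(j-1) H to get A^j H without ever forming a dense A^j.
            propagated = adj_norm @ propagated
        outputs.append(np.maximum(propagated @ weight, 0.0))
    return np.concatenate(outputs, axis=1)


rng = np.random.default_rng(0)
# A toy path graph on 4 nodes.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = rng.normal(size=(4, 5))
weights = [rng.normal(size=(5, 3)) for _ in range(3)]  # powers 0, 1, 2

hidden = mixhop_layer(normalize_adjacency(adj), features, weights)
print(hidden.shape)  # (4, 9): three powers, three output columns each
```

Multiplying the features by the adjacency matrix repeatedly, rather than raising the adjacency matrix to a power first, keeps every product sparse-times-dense when a sparse adjacency matrix is used.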

This repository provides a PyTorch implementation of MixHop and N-GCN as described in the papers:

MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing Sami Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Hrayr Harutyunyan, Nazanin Alipourfard, Kristina Lerman, Greg Ver Steeg, and Aram Galstyan. ICML, 2019. [Paper]

A Higher-Order Graph Convolutional Layer. Sami A Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Nazanin Alipourfard, Hrayr Harutyunyan. NeurIPS, 2018. [Paper]

The original TensorFlow implementation of MixHop is available [Here].

Requirements

The codebase is implemented in Python 3.5.2. The package versions used for development are listed below.

networkx          2.4
tqdm              4.28.1
numpy             1.15.4
pandas            0.23.4
texttable         1.5.0
scipy             1.1.0
argparse          1.1.0
torch             1.1.0
torch-sparse      0.3.0

Datasets

The code takes the **edge list** of the graph in a csv file. Every row indicates an edge between two nodes, separated by a comma. The first row is a header. Nodes should be indexed starting with 0. A sample graph for `Cora` is included in the `input/` directory. In addition to the edge list, there is a JSON file with the sparse features and a csv with the target variable.
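A minimal sketch of reading an edge list in this layout with the standard library is shown below. The header names `id_1,id_2` and the helper `read_edges` are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io


def read_edges(handle):
    """Return (source, target) integer pairs, skipping the header row."""
    reader = csv.reader(handle)
    next(reader)  # the first row is a header
    return [(int(row[0]), int(row[1])) for row in reader if row]


# In-memory stand-in for an edge list file (hypothetical header names).
sample = io.StringIO("id_1,id_2\n0,1\n1,2\n2,0\n")
edges = read_edges(sample)

# With zero-based node ids, the node count is the largest id plus one.
node_count = max(max(source, target) for source, target in edges) + 1
print(edges, node_count)
```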

The **feature matrix** is a sparse binary matrix stored as a JSON file. Node identifiers are the keys, and each value is a list of the feature column indices that are set for that node. The feature matrix is structured as:

{ 0: [0, 1, 38, 1968, 2000, 52727],
  1: [10000, 20, 3],
  2: [],
  ...
  n: [2018, 10000]}
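The sketch below shows one way such a dictionary could be turned into coordinate (row, column) index lists for a sparse matrix. The helper `features_to_coo` is a hypothetical name, and the sample JSON is made up. Note that JSON keys are always strings, so they are cast to integers here.

```python
import json


def features_to_coo(raw_json):
    """Convert {node: [feature columns]} into row/column index lists and a shape."""
    features = json.loads(raw_json)
    rows, cols = [], []
    for node, columns in features.items():
        for column in columns:
            rows.append(int(node))  # JSON keys arrive as strings
            cols.append(int(column))
    node_count = max(int(node) for node in features) + 1
    feature_count = max(cols) + 1 if cols else 0
    return rows, cols, (node_count, feature_count)


raw = '{"0": [0, 2], "1": [1], "2": []}'
rows, cols, shape = features_to_coo(raw)
print(rows, cols, shape)
```

The index lists and shape can be passed to a sparse constructor such as `scipy.sparse.coo_matrix((data, (rows, cols)), shape=shape)` with a vector of ones as `data`. Under this sketch's assumptions, a node without features still needs an entry with an empty list (like node 2 above) so that the row count matches the number of nodes in the graph.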

The **target vector** is a csv with two columns and a header row: the first column contains the node identifiers and the second contains the targets. The csv is sorted by node identifier, and the target column contains the class memberships indexed from zero.

NODE ID Target
0 3
1 1
2 0
3 1
... ...
n 3
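A small validation sketch for a target file in this layout follows. The header names `id,target` and the helper `read_targets` are assumptions for illustration; the check only enforces the two properties stated above (sorted identifiers and zero-based classes).

```python
import csv
import io


def read_targets(handle):
    """Return the class labels in node order, checking that ids run 0, 1, 2, ..."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    pairs = [(int(row[0]), int(row[1])) for row in reader if row]
    node_ids = [node for node, _ in pairs]
    if node_ids != list(range(len(node_ids))):
        raise ValueError("node identifiers must be sorted and start at 0")
    return [label for _, label in pairs]


# In-memory stand-in for a target file (hypothetical header names).
sample = io.StringIO("id,target\n0,3\n1,1\n2,0\n3,1\n")
targets = read_targets(sample)

# With zero-based classes, the class count is the largest label plus one.
class_count = max(targets) + 1
print(targets, class_count)
```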

Options

Training an N-GCN/MixHop model is handled by the `src/main.py` script, which provides the following command line arguments.

Input and output options

  --edge-path       STR    Edge list csv.         Default is `input/cora_edges.csv`.
  --features-path   STR    Features json.         Default is `input/cora_features.json`.
  --target-path     STR    Target classes csv.    Default is `input/cora_target.csv`.

Model options

  --model             STR     Model variant.                 Default is `mixhop`.               
  --seed              INT     Random seed.                   Default is 42.
  --epochs            INT     Number of training epochs.     Default is 2000.
  --early-stopping    INT     Early stopping rounds.         Default is 10.
  --training-size     INT     Training set size.             Default is 1500.
  --validation-size   INT     Validation set size.           Default is 500.
  --learning-rate     FLOAT   Adam learning rate.            Default is 0.01.
  --dropout           FLOAT   Dropout rate value.            Default is 0.5.
  --lambd             FLOAT   Regularization coefficient.    Default is 0.0005.
  --layers-1          LST     Layer sizes (upstream).        Default is [200, 200, 200]. 
  --layers-2          LST     Layer sizes (bottom).          Default is [200, 200, 200].
  --cut-off           FLOAT   Norm cut-off for pruning.      Default is 0.1.
  --budget            INT     Architecture neuron budget.    Default is 60.
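To make the `LST` type concrete, here is a hedged `argparse` sketch covering a subset of the options above. It is an assumption about how such flags could be declared, not a copy of the repository's parser; `build_parser` is a hypothetical name. With `nargs="+"`, list values are passed as space-separated integers.

```python
import argparse


def build_parser():
    """Hypothetical parser for a subset of the documented options."""
    parser = argparse.ArgumentParser(description="Sketch of the model options.")
    parser.add_argument("--model", default="mixhop")
    parser.add_argument("--epochs", type=int, default=2000)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    # nargs="+" collects one or more integers into a list.
    parser.add_argument("--layers-1", nargs="+", type=int, default=[200, 200, 200])
    parser.add_argument("--layers-2", nargs="+", type=int, default=[200, 200, 200])
    return parser


args = build_parser().parse_args(["--epochs", "100", "--layers-1", "64", "64"])
# argparse turns dashes in flag names into underscores in attribute names.
print(args.epochs, args.layers_1, args.layers_2)
```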

Examples

The following commands train a neural network and score it on the test set. Training a model on the default dataset.

$ python src/main.py

Training a MixHop model for 100 epochs.

$ python src/main.py --epochs 100

Increasing the learning rate and the dropout.

$ python src/main.py --learning-rate 0.1 --dropout 0.9

Training a model with diffusion order 2:

$ python src/main.py --layers-1 64 64 --layers-2 64 64

Training an N-GCN model:

$ python src/main.py --model ngcn

License


Comments
  • FileNotFoundError: [Errno 2] No such file or directory: './input/cora_edges.csv'

    Hello, when I run src/main.py, the following error appears:

        File "pandas\_libs\parsers.pyx", line 361, in pandas._libs.parsers.TextReader.__cinit__
        File "pandas\_libs\parsers.pyx", line 653, in pandas._libs.parsers.TextReader._setup_parser_source
        FileNotFoundError: [Errno 2] No such file or directory: './input/cora_edges.csv'

    Do you know how to solve it?

    opened by tanjia123456 4
  • Citeseer and Pubmed Datasets

    Hi Benedek,

    Thank you so much for the code. I want to run your code on the Citeseer and Pubmed datasets. Would you mind providing the Citeseer and Pubmed data in this format? By the way, after running the MixHop model with default parameters I got a test accuracy of 0.7867. Does the accuracy depend on the system the code runs on?

    Thanks in advance

    opened by bousejin 3
  • IndexError:

    Hello, when I run main.py, the following error occurs:

        File "D:\anaconda3.4\lib\site-packages\torch_sparse\spmm.py", line 30, in spmm
          out = matrix[col]
        IndexError: index 10241 is out of bounds for dimension 0 with size 10241

    The content of spmm.py:

        # import torch
        from torch_scatter import scatter_add


        def spmm(index, value, m, n, matrix):
            """Matrix product of sparse matrix with dense matrix.

            Args:
                index (:class:`LongTensor`): The index tensor of sparse matrix.
                value (:class:`Tensor`): The value tensor of sparse matrix.
                m (int): The first dimension of corresponding dense matrix.
                n (int): The second dimension of corresponding dense matrix.
                matrix (:class:`Tensor`): The dense matrix.

            :rtype: :class:`Tensor`
            """

            assert n == matrix.size(0)

            row, col = index

            matrix = matrix if matrix.dim() > 1 else matrix.unsqueeze(-1)

            out = matrix[col]
            out = out * value.unsqueeze(-1)
            out = scatter_add(out, row, dim=0, dim_size=m)

            return out

    By the way, I use my own dataset, and the number of nodes is 10242. Do you know how to solve it?

    opened by tanjia123456 2
  • some problem about codes

    When I run the code, the following error occurs:

        MixHop-and-N-GCN-master\src\utils.py", line 45, in feature_reader
          out_features["indices"] = torch.LongTensor(np.concatenate([features.row.reshape(-1,1), features.col.reshape(-1,1)],axis=1).T)
        TypeError: can't convert np.ndarray of type numpy.int32. The only supported types are: float64, float32, float16, int64, int32, int16, int8, and uint8.

    I searched online and found that it seems to be caused by a list of lists that are not of the same length. I am stuck and do not know how to correct it. Looking forward to your help, thanks!

    opened by junkangwu 2
  • About "torch_scatter"

    I cannot install "torch_scatter". When I run `pip3 install torch_scatter`, the error below always occurs. I tried to solve the problem but could not find a fix. Could you help me? Thanks a lot!

    The error: ... cpu/scatter.cpp:1:29: fatal error: torch/extension.h: No such file or directory compilation terminated. error: command 'x86_64-linux-gnu-gcc' failed with exit status 1


        Failed building wheel for torch-scatter
        Running setup.py clean for torch-scatter
        Failed to build torch-scatter
        Installing collected packages: torch-scatter
        Running setup.py install for torch-scatter ... error
        Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-ijd9s63n/torch-scatter/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-4tkx7v_s-record/install-record.txt --single-version-externally-managed --compile:
        running install
        running build
        running build_py
        creating build
        creating build/lib.linux-x86_64-3.5
        creating build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/mul.py -> build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/mean.py -> build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/sub.py -> build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/min.py -> build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/std.py -> build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/__init__.py -> build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/max.py -> build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/div.py -> build/lib.linux-x86_64-3.5/torch_scatter
        copying torch_scatter/add.py -> build/lib.linux-x86_64-3.5/torch_scatter
        creating build/lib.linux-x86_64-3.5/test
        copying test/test_multi_gpu.py -> build/lib.linux-x86_64-3.5/test
        copying test/utils.py -> build/lib.linux-x86_64-3.5/test
        copying test/test_std.py -> build/lib.linux-x86_64-3.5/test
        copying test/__init__.py -> build/lib.linux-x86_64-3.5/test
        copying test/test_forward.py -> build/lib.linux-x86_64-3.5/test
        copying test/test_backward.py -> build/lib.linux-x86_64-3.5/test
        creating build/lib.linux-x86_64-3.5/torch_scatter/utils
        copying torch_scatter/utils/ext.py -> build/lib.linux-x86_64-3.5/torch_scatter/utils
        copying torch_scatter/utils/__init__.py -> build/lib.linux-x86_64-3.5/torch_scatter/utils
        copying torch_scatter/utils/gen.py -> build/lib.linux-x86_64-3.5/torch_scatter/utils
        running build_ext
        building 'torch_scatter.scatter_cpu' extension
        creating build/temp.linux-x86_64-3.5
        creating build/temp.linux-x86_64-3.5/cpu
        x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/zgh/.local/lib/python3.5/site-packages/torch/lib/include -I/home/zgh/.local/lib/python3.5/site-packages/torch/lib/include/TH -I/home/zgh/.local/lib/python3.5/site-packages/torch/lib/include/THC -I/usr/include/python3.5m -c cpu/scatter.cpp -o build/temp.linux-x86_64-3.5/cpu/scatter.o -Wno-unused-variable -DTORCH_EXTENSION_NAME=scatter_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        cpu/scatter.cpp:1:29: fatal error: torch/extension.h: No such file or directory
        compilation terminated.
        error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

        ----------------------------------------
        Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-ijd9s63n/torch-scatter/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-4tkx7v_s-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-ijd9s63n/torch-scatter/
        You are using pip version 8.1.1, however version 19.1.1 is available.
        You should consider upgrading via the 'pip install --upgrade pip' command.

    opened by zhangguanghui1 2
  • running error

    Hello, when I run main.py, the following error occurs:

        File "/home/tj/anaconda3/lib/python3.6/site-packages/torch_sparse/__init__.py", line 22, in raise OSError(e)
        OSError: libcusparse.so.10.0: cannot open shared object file: No such file or directory

    My setup is Python 3.6, CUDA 10.0, torch 1.0.1, and torch-sparse 0.5.1. Do you know how to solve it?

    opened by tanjia123456 1
  • Higher Powers Implementation

    https://github.com/benedekrozemberczki/MixHop-and-N-GCN/blob/6e4ae00055fc1aecd972081ef9c152b0e9de37c1/src/layers.py#L54-L56

    To me it looks like this is implementing H(l+1) = σ(A^j * H(l) * W(l))

    Can you explain where W_j and the concatenation are taking place?

    opened by datavistics 1
  • Was this paper accepted at NeurIPS 2018?

    I did not find the paper "A Higher-Order Graph Convolutional Layer" in the NeurIPS 2018 accepted list, so I am not sure whether it has been accepted.

    opened by Jhy1993 1
Releases(v_00001)
Owner
Benedek Rozemberczki
Machine Learning Engineer at AstraZeneca | PhD from The University of Edinburgh.