Paper Information

Title: DropEdge: Towards Deep Graph Convolutional Networks on Node Classification
Authors: Yu Rong, Wenbing Huang, Tingyang Xu, Junzhou Huang
Venue: ICLR 2020
Paper: download
Code: download

1 Introduction

  Since I can't get through the 2022 papers yet, I picked a 2020 paper to ease the mood. It's tough.

  The paper proposes a strategy that alleviates both over-fitting and over-smoothing, and that can be combined with other backbone models to obtain better performance.

  It verifies that over-smoothing is prone to occur on small graphs: see Figure 1 for the results of an 8-layer GCN on the Cora dataset.

  

  The main idea of DropEdge: at each training epoch, randomly remove a fixed fraction of edges from the original graph.

  Applying DropEdge during GCN training has several benefits:

  1. DropEdge can be seen as a data-augmentation technique. Each training epoch removes a different random subset of edges from the original graph, which increases the randomness and diversity of the input data and thus alleviates over-fitting.
  2. DropEdge can also be seen as a message-passing reducer. In GCNs, messages pass between adjacent nodes along edges; randomly removing some edges makes node connections sparser and, to some extent, avoids the over-smoothing caused by deepening the GCN.

2 Preliminary

GCN

  The forward propagation of layer $l$ is:

    $\boldsymbol{H}^{(l+1)}=\sigma\left(\hat{\boldsymbol{A}} \boldsymbol{H}^{(l)} \boldsymbol{W}^{(l)}\right)\quad\quad\quad(1)$

  where $\hat{\boldsymbol{A}}=\hat{\boldsymbol{D}}^{-1 / 2}(\boldsymbol{A}+\boldsymbol{I}) \hat{\boldsymbol{D}}^{-1 / 2}$ with $\hat{\boldsymbol{D}}$ the degree matrix of $\boldsymbol{A}+\boldsymbol{I}$, and $\boldsymbol{W}^{(l)} \in \mathbb{R}^{C_{l} \times C_{l+1}}$ is the weight matrix of layer $l$.
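  A minimal NumPy sketch of this propagation rule, on a toy 3-node path graph (function names are my own; real implementations would use sparse matrices and a deep-learning framework):

```python
import numpy as np

def normalize_adj(A):
    """Renormalization: A_hat = D_hat^{-1/2} (A + I) D_hat^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)                     # degrees of A + I
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def gcn_layer(A_hat, H, W):
    """One layer of Eq. (1): H^{(l+1)} = sigma(A_hat H^{(l)} W^{(l)}), sigma = ReLU."""
    return np.maximum(A_hat @ H @ W, 0.0)

# toy graph: 3 nodes on a path; 2-dim input features, 2 output channels
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H0 = np.random.default_rng(0).standard_normal((3, 2))
W0 = np.random.default_rng(1).standard_normal((2, 2))
H1 = gcn_layer(normalize_adj(A), H0, W0)
print(H1.shape)  # (3, 2)
```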

3 Method

3.1 Methodology

  At each training epoch, DropEdge randomly deletes a certain number of edges of the input graph. Formally, it randomly sets $Vp$ of the non-zero elements of the adjacency matrix $A$ to zero, where $V$ is the total number of edges and $p$ is the drop rate. Denoting the resulting adjacency matrix as $A_{drop}$, its relation to $A$ becomes

    $A_{\mathrm{drop}}=A-A^{\prime}\quad\quad\quad(2)$

  where $\boldsymbol{A}^{\prime}$ is a sparse matrix built from a random subset of $Vp$ edges of the original graph. $\boldsymbol{A}_{\text {drop }}$ is then re-normalized to obtain $\hat{\mathbf{A}}_{\text {drop }}$, which replaces $\hat{\mathbf{A}}$ in $\text{Eq.1}$.
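  A sketch of this sampling step, assuming an undirected graph with a dense adjacency matrix (the function name is my own; after dropping, $A_{drop}$ would be re-normalized as in Eq. 1):

```python
import numpy as np

def drop_edge(A, p, rng):
    """Zero out a p-fraction of the undirected edges of A (kept symmetric).

    Returns A_drop = A - A', where A' is built from V*p sampled edges.
    """
    iu, ju = np.triu_indices_from(A, k=1)
    edges = [(i, j) for i, j in zip(iu, ju) if A[i, j] != 0]
    V = len(edges)                           # total number of undirected edges
    n_drop = int(round(V * p))
    drop_idx = rng.choice(V, size=n_drop, replace=False)
    A_drop = A.copy()
    for k in drop_idx:
        i, j = edges[k]
        A_drop[i, j] = A_drop[j, i] = 0.0    # remove the edge in both directions
    return A_drop

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
A_drop = drop_edge(A, p=0.5, rng=rng)  # 5 edges in A; drops round(2.5) = 2 of them
```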

Preventing over-fitting

  DropEdge introduces perturbations to the connections in the graph and produces different random deformations of the input data; it can therefore be seen as data augmentation.

  The core idea of GCNs is to aggregate neighbor information via a weighted sum over each node's neighbor features. DropEdge can thus be seen as using a random subset of neighbors, rather than all neighbors, for aggregation during GNN training. With drop rate $p$, the expectation of the neighbor aggregation is only changed by a multiplier determined by $p$, and this multiplier effectively disappears once the weights are normalized.
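  The expectation claim can be checked numerically. A small simulation (my own sketch: each undirected edge is dropped independently with probability $p$, which matches DropEdge's marginal keep probability $1-p$) shows the averaged aggregation converging to $(1-p)\,A x$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)  # triangle graph
x = np.array([1.0, 2.0, 3.0])                                 # 1-dim node features
p = 0.3

# Average the aggregated features A_drop @ x over many random drops;
# each undirected edge survives with probability 1 - p.
agg = np.zeros(3)
n_trials = 20000
for _ in range(n_trials):
    mask = np.triu(rng.random(A.shape) > p, k=1)  # keep-mask for upper-triangle edges
    keep = (mask | mask.T).astype(float)          # symmetrize the mask
    agg += (A * keep) @ x
agg /= n_trials

print(agg)  # close to (1 - p) * (A @ x)
```

  So in expectation the aggregation is merely rescaled, and that scale is absorbed when the weights are normalized.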

Layer-Wise DropEdge

  The above uses, in each epoch, one $\boldsymbol{A}_{\text {drop }}$ shared by all GNN layers; but each layer can also perform DropEdge independently, which brings even more randomness into the data.

  Note: similarly, a kNN graph can also be computed separately for each layer.
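  The layer-wise variant can be sketched as follows (my own minimal NumPy version: a fresh $A_{drop}$, and hence a fresh $\hat{A}_{drop}$, is sampled for every layer; edges are dropped independently with probability $p$ for simplicity):

```python
import numpy as np

def drop_edge(A, p, rng):
    """Drop each undirected edge independently with probability p (a sketch)."""
    keep = np.triu(rng.random(A.shape) > p, k=1)
    keep = (keep | keep.T).astype(float)
    return A * keep

def normalize_adj(A):
    """Renormalization trick; the added self-loops keep all degrees >= 1."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
    return d_inv_sqrt @ A_tilde @ d_inv_sqrt

rng = np.random.default_rng(2)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = np.random.default_rng(3).standard_normal((4, 2))
Ws = [np.random.default_rng(4 + l).standard_normal((2, 2)) for l in range(3)]

# Layer-wise DropEdge: resample A_drop (and its normalization) at every layer,
# instead of sharing one A_drop across the whole forward pass.
for W in Ws:
    A_hat = normalize_adj(drop_edge(A, p=0.2, rng=rng))
    H = np.maximum(A_hat @ H @ W, 0.0)
```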

  The following explains how DropEdge alleviates the over-smoothing problem, under the assumption that all layers share one $\boldsymbol{A}_{\text {drop }}$.

3.2 Preventing over-smoothing

  The original definition of over-smoothing: as the depth of the network increases, node features converge to a fixed point. This unwanted convergence makes the output of a deep GCN depend only on the topology of the graph and not on the input node features, which harms the expressive power of GCNs.

  Taking non-linearities and convolutional filters into account, over-smoothing can be interpreted as convergence to a subspace rather than to a fixed point; this paper adopts the more general subspace notion.

  First, the following definitions are given:

  

  According to the conclusion of Oono & Suzuki, a sufficiently deep GCN will, under certain conditions, suffer from $\epsilon$-smoothing for arbitrarily small values of $\epsilon$. However, they only show that $\epsilon$-smoothing exists in deep GCNs, without proposing a corresponding remedy. DropEdge rests on two observations:

    • Reducing the connections between nodes slows down the convergence of over-smoothing;
    • The gap between the dimensions of the original space and the subspace measures the amount of information lost;

  That is:

  

4 Discussions

DropEdge vs. Dropout

  Dropout perturbs the feature matrix by randomly setting feature dimensions to zero. It may reduce over-fitting, but it does not help prevent over-smoothing, because it makes no change to the adjacency matrix.
  DropEdge can be seen as the generalization of Dropout to graph data, replacing feature deletion with edge deletion; the two are complementary.

DropEdge vs DropNode

  DropNode samples subgraphs for mini-batch training and can be viewed as a specific form of edge deletion, since the edges connected to a deleted node are also deleted. However, DropNode's effect on edge removal is node-oriented and indirect.

  DropEdge is edge-oriented, preserves all node features during training, and thus shows more flexibility.

  The sampling strategies in current DropNode methods are usually inefficient: for example, GraphSAGE's layer size grows exponentially with depth, while AS-GCN requires recursive layer-by-layer sampling. In contrast, DropEdge neither grows the layer size with depth nor requires a recursive process, since all edges are sampled in parallel.

DropEdge vs Graph-Sparsification

  Graph sparsification (1997) aims to remove unnecessary edges for graph compression while preserving almost all information of the input graph. This goal is shared with DropEdge, but the difference is that DropEdge requires no specific optimization objective, whereas graph sparsification employs a tedious optimization procedure to determine which edges to remove, and once those edges are discarded the output graph stays fixed.

5 Experiment

Datasets

  

Backbones

  

Node classification (supervised learning)

  

Validation loss

  

Normalization / propagation models

  

6 Conclusion

  DropEdge brings more diversity into the input data to prevent over-fitting, and reduces message passing in graph convolution to relieve over-smoothing.
