7. Graph Residual Networks (GRNs)

Stack \(K\) GNN layers

\(\rightarrow\) but, not much performance improvement…

( \(\because\) noisy information from too many neighbors is also propagated )


Thus, use skip connections to solve the problem!

\(\rightarrow\) called GRNs ( Graph Residual Networks )


7-1. Highway GCN

Highway Network + GNN

  • in each layer, the input is multiplied by gating weights

    & summed with the output ( see the sketch below )

  • \(\mathbf{T}\left(\mathbf{h}^{t}\right)=\sigma\left(\mathbf{W}^{t} \mathbf{h}^{t}+\mathbf{b}^{t}\right)\).
  • \(\mathbf{h}^{t+1}=\mathbf{h}^{t+1} \odot \mathbf{T}\left(\mathbf{h}^{t}\right)+\mathbf{h}^{t} \odot\left(1-\mathbf{T}\left(\mathbf{h}^{t}\right)\right)\).
    • ( the \(\mathbf{h}^{t+1}\) on the right-hand side is the un-gated output of layer \(t\) )


Highway gates

  • ability to select between NEW & OLD hidden states
  • early hidden states can be propagated to the final state, if needed!


Performance peaks at around 4 layers ( not much difference afterwards… )
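
A minimal PyTorch sketch of the highway gate above ( the `HighwayGate` name, shapes, and hidden dimension are illustrative assumptions, not the paper's code ) :

```python
import torch
import torch.nn as nn

class HighwayGate(nn.Module):
    """Per-layer highway gate: blends the NEW hidden state with the OLD one."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)  # W^t, b^t

    def forward(self, h_new: torch.Tensor, h_old: torch.Tensor) -> torch.Tensor:
        t = torch.sigmoid(self.linear(h_old))   # T(h^t) = sigma(W^t h^t + b^t)
        return h_new * t + h_old * (1.0 - t)    # gated sum of new & old states

# usage : wrap the output of any GNN layer
gate = HighwayGate(dim=64)
h_old = torch.randn(10, 64)   # hidden states of 10 nodes at step t
h_new = torch.randn(10, 64)   # un-gated output of the GNN layer at step t
h_next = gate(h_new, h_old)   # h^{t+1}
```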


7-2. Jumping Knowledge Network

Limitations of neighborhood aggregation

  • different nodes in a graph may need different receptive fields
    • ex) nodes in the graph core : their neighborhood expands very fast \(\rightarrow\) a few hops suffice
    • ex) nodes far from the core : their neighborhood grows slowly \(\rightarrow\) more hops needed


Jumping Knowledge Network

  • adaptive, structure-aware representations
  • selects from ALL of the intermediate representations

  • able to select an effective neighborhood size per node

  • can be combined with GCN, GraphSAGE, GAT, … ( see the sketch below )

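A minimal PyTorch sketch of the layer-aggregation ( "jumping" ) step, showing the concatenation & element-wise max aggregators ; the `JumpingKnowledge` name and all shapes are illustrative assumptions :

```python
import torch
import torch.nn as nn

class JumpingKnowledge(nn.Module):
    """Combine ALL intermediate layer representations into the final one."""
    def __init__(self, mode: str = "cat"):
        super().__init__()
        assert mode in ("cat", "max")
        self.mode = mode

    def forward(self, hs: list) -> torch.Tensor:
        # hs : list of per-layer node representations, each (num_nodes, dim)
        if self.mode == "cat":
            return torch.cat(hs, dim=-1)        # (num_nodes, num_layers * dim)
        stacked = torch.stack(hs, dim=0)        # (num_layers, num_nodes, dim)
        return stacked.max(dim=0).values        # node-wise max over all layers

# usage : collect every layer's output, then "jump" to the final state
hs = [torch.randn(10, 64) for _ in range(4)]   # 4 layers, 10 nodes
final = JumpingKnowledge("max")(hs)            # (10, 64)
```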


7-3. DeepGCNs

stacking more layers … what are the problems?

  • (1) vanishing gradient
  • (2) over-smoothing


Solution :

  • for problem (1) : use residual connections & dense connections

  • for problem (2) : use dilated convolutions ( dilated k-NN )


3 types of GCN

  • Plain GCN ( = Vanilla GCN )
  • ResGCN
  • DenseGCN


Plain GCN ( = Vanilla GCN )

  • \(\mathbf{H}^{t+1}=\mathcal{F}\left(\mathbf{H}^{t}, \mathbf{W}^{t}\right)\).


ResGCN

  • \(\mathbf{H}_{\text {Res }}^{t+1} =\mathbf{H}^{t+1}+\mathbf{H}^{t} =\mathcal{F}\left(\mathbf{H}^{t}, \mathbf{W}^{t}\right)+\mathbf{H}^{t}\).


DenseGCN

  • \(\begin{aligned} \mathbf{H}_{\text {Dense }}^{t+1} &=\mathcal{T}\left(\mathbf{H}^{t+1}, \mathbf{H}^{t}, \ldots, \mathbf{H}^{0}\right) \\ &=\mathcal{T}\left(\mathcal{F}\left(\mathbf{H}^{t}, \mathbf{W}^{t}\right), \mathcal{F}\left(\mathbf{H}^{t-1}, \mathbf{W}^{t-1}\right), \ldots, \mathbf{H}^{0}\right) \end{aligned}\).
    • \(\mathcal{T}\) : vertex-wise concatenation ( see the sketch below )
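
A minimal sketch contrasting the three update rules above, with a toy \(\mathcal{F}\) standing in for an actual graph convolution ( all names & shapes are illustrative assumptions ) :

```python
import torch

def plain_step(F, h, w):
    """H^{t+1} = F(H^t, W^t) : one vanilla GCN layer."""
    return F(h, w)

def res_step(F, h, w):
    """H^{t+1}_Res = F(H^t, W^t) + H^t : residual (skip) connection."""
    return F(h, w) + h

def dense_step(F, states, w):
    """H^{t+1}_Dense = T(F(H^t, W^t), ..., H^0) : keep ALL earlier states,
    to be vertex-wise concatenated at the end."""
    return states + [F(states[-1], w)]

# toy F : one weight multiplication standing in for a graph convolution
F = lambda h, w: torch.relu(h @ w)
h0 = torch.randn(10, 16)                  # 10 nodes, 16-d features
w = torch.randn(16, 16)

h1_plain = plain_step(F, h0, w)           # (10, 16)
h1_res = res_step(F, h0, w)               # (10, 16)
states = dense_step(F, [h0], w)           # [H^0, H^1]
h1_dense = torch.cat(states, dim=-1)      # vertex-wise concatenation T(...)
```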




Dilated Convolutions

  • to alleviate over-smoothing
  • the DeepGCNs paper uses dilated k-NN to build neighborhoods

  • leverages information from different contexts

    & enlarges the receptive field

\(\rightarrow\) added to ResGCN & DenseGCN ( see the sketch below )
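
A minimal sketch of dilated k-NN neighbor selection, assuming a plain \(O(N^2)\) pairwise-distance computation ( the function name & shapes are illustrative, not the DeepGCNs reference code ) :

```python
import torch

def dilated_knn(x: torch.Tensor, k: int, d: int) -> torch.Tensor:
    """For each node, find the k*d nearest neighbors in feature space,
    then keep every d-th one -> k neighbors covering a wider context.
    x : (num_nodes, dim). Returns neighbor indices of shape (num_nodes, k)."""
    dist = torch.cdist(x, x)                        # pairwise distances
    idx = dist.topk(k * d, largest=False).indices   # k*d nearest (incl. self)
    return idx[:, ::d]                              # keep every d-th neighbor

x = torch.randn(100, 32)               # 100 nodes with 32-d features
neighbors = dilated_knn(x, k=9, d=2)   # (100, 9) dilated neighborhood
```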


