7. Graph Residual Networks (GRNs)

Stack \(K\) GNN layers

\(\rightarrow\) but, not much performance improvement…

( \(\because\) noisy information from too many neighbors is also propagated )


Thus, use skip connections to solve the problem!

\(\rightarrow\) called GRNs ( Graph Residual Networks )


7-1. Highway GCN

Highway Network + GNN

  • in each layer, the input is multiplied by gating weights

    & summed with the output ( see the sketch below )

  • \(\mathbf{T}\left(\mathbf{h}^{t}\right)=\sigma\left(\mathbf{W}^{t} \mathbf{h}^{t}+\mathbf{b}^{t}\right)\).
  • \(\mathbf{h}^{t+1}=\mathbf{h}^{t+1} \odot \mathbf{T}\left(\mathbf{h}^{t}\right)+\mathbf{h}^{t} \odot\left(1-\mathbf{T}\left(\mathbf{h}^{t}\right)\right)\).
    • ( the \(\mathbf{h}^{t+1}\) on the right-hand side is the un-gated output of layer \(t\) )


Highway gates

  • ability to select between NEW & OLD hidden states
  • early hidden states can be propagated to the final state, if needed!


Performance peaks at around 4 layers ( not much difference afterwards… )
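
A minimal PyTorch sketch of the highway gate above ( the `HighwayGate` name, shapes, and hidden dimension are illustrative assumptions, not the paper's code ) :

```python
import torch
import torch.nn as nn

class HighwayGate(nn.Module):
    """Per-layer highway gate: blends the NEW hidden state with the OLD one."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)  # W^t, b^t

    def forward(self, h_new: torch.Tensor, h_old: torch.Tensor) -> torch.Tensor:
        t = torch.sigmoid(self.linear(h_old))   # T(h^t) = sigma(W^t h^t + b^t)
        return h_new * t + h_old * (1.0 - t)    # gated sum of new & old states

# usage : wrap the output of any GNN layer
gate = HighwayGate(dim=64)
h_old = torch.randn(10, 64)   # hidden states of 10 nodes at step t
h_new = torch.randn(10, 64)   # un-gated output of the GNN layer at step t
h_next = gate(h_new, h_old)   # h^{t+1}
```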


7-2. Jumping Knowledge Network

Limitations of neighborhood aggregation

  • different nodes in a graph may need different receptive fields
    • ex) nodes in the graph core : their neighborhood expands very fast \(\rightarrow\) a few hops suffice
    • ex) nodes far from the core : their neighborhood grows slowly \(\rightarrow\) more hops needed


Jumping Knowledge Network

  • adaptive, structure-aware representations
  • selects from ALL of the intermediate representations

  • able to select an effective neighborhood size per node

  • can be combined with GCN, GraphSAGE, GAT, … ( see the sketch below )

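A minimal PyTorch sketch of the layer-aggregation ( "jumping" ) step, showing the concatenation & element-wise max aggregators ; the `JumpingKnowledge` name and all shapes are illustrative assumptions :

```python
import torch
import torch.nn as nn

class JumpingKnowledge(nn.Module):
    """Combine ALL intermediate layer representations into the final one."""
    def __init__(self, mode: str = "cat"):
        super().__init__()
        assert mode in ("cat", "max")
        self.mode = mode

    def forward(self, hs: list) -> torch.Tensor:
        # hs : list of per-layer node representations, each (num_nodes, dim)
        if self.mode == "cat":
            return torch.cat(hs, dim=-1)        # (num_nodes, num_layers * dim)
        stacked = torch.stack(hs, dim=0)        # (num_layers, num_nodes, dim)
        return stacked.max(dim=0).values        # node-wise max over all layers

# usage : collect every layer's output, then "jump" to the final state
hs = [torch.randn(10, 64) for _ in range(4)]   # 4 layers, 10 nodes
final = JumpingKnowledge("max")(hs)            # (10, 64)
```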


7-3. DeepGCNs

stacking more layers … what are the problems?

  • (1) vanishing gradient
  • (2) over-smoothing


Solution :

  • for problem (1) : use residual connections & dense connections

  • for problem (2) : use dilated convolutions ( dilated k-NN )


3 types of GCN

  • Plain GCN ( = Vanilla GCN )
  • ResGCN
  • DenseGCN


Plain GCN ( = Vanilla GCN )

  • \(\mathbf{H}^{t+1}=\mathcal{F}\left(\mathbf{H}^{t}, \mathbf{W}^{t}\right)\).


ResGCN

  • \(\mathbf{H}_{\text {Res }}^{t+1} =\mathbf{H}^{t+1}+\mathbf{H}^{t} =\mathcal{F}\left(\mathbf{H}^{t}, \mathbf{W}^{t}\right)+\mathbf{H}^{t}\).


DenseGCN

  • \(\begin{aligned} \mathbf{H}_{\text {Dense }}^{t+1} &=\mathcal{T}\left(\mathbf{H}^{t+1}, \mathbf{H}^{t}, \ldots, \mathbf{H}^{0}\right) \\ &=\mathcal{T}\left(\mathcal{F}\left(\mathbf{H}^{t}, \mathbf{W}^{t}\right), \mathcal{F}\left(\mathbf{H}^{t-1}, \mathbf{W}^{t-1}\right), \ldots, \mathbf{H}^{0}\right) \end{aligned}\).
    • \(\mathcal{T}\) : vertex-wise concatenation ( see the sketch below )
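
A minimal sketch contrasting the three update rules above, with a toy \(\mathcal{F}\) standing in for an actual graph convolution ( all names & shapes are illustrative assumptions ) :

```python
import torch

def plain_step(F, h, w):
    """H^{t+1} = F(H^t, W^t) : one vanilla GCN layer."""
    return F(h, w)

def res_step(F, h, w):
    """H^{t+1}_Res = F(H^t, W^t) + H^t : residual (skip) connection."""
    return F(h, w) + h

def dense_step(F, states, w):
    """H^{t+1}_Dense = T(F(H^t, W^t), ..., H^0) : keep ALL earlier states,
    to be vertex-wise concatenated at the end."""
    return states + [F(states[-1], w)]

# toy F : one weight multiplication standing in for a graph convolution
F = lambda h, w: torch.relu(h @ w)
h0 = torch.randn(10, 16)                  # 10 nodes, 16-d features
w = torch.randn(16, 16)

h1_plain = plain_step(F, h0, w)           # (10, 16)
h1_res = res_step(F, h0, w)               # (10, 16)
states = dense_step(F, [h0], w)           # [H^0, H^1]
h1_dense = torch.cat(states, dim=-1)      # vertex-wise concatenation T(...)
```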




Dilated Convolutions

  • to alleviate over-smoothing
  • the DeepGCNs paper uses dilated k-NN to build neighborhoods

  • leverages information from different contexts

    & enlarges the receptive field

\(\rightarrow\) added to ResGCN & DenseGCN ( see the sketch below )
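
A minimal sketch of dilated k-NN neighbor selection, assuming a plain \(O(N^2)\) pairwise-distance computation ( the function name & shapes are illustrative, not the DeepGCNs reference code ) :

```python
import torch

def dilated_knn(x: torch.Tensor, k: int, d: int) -> torch.Tensor:
    """For each node, find the k*d nearest neighbors in feature space,
    then keep every d-th one -> k neighbors covering a wider context.
    x : (num_nodes, dim). Returns neighbor indices of shape (num_nodes, k)."""
    dist = torch.cdist(x, x)                        # pairwise distances
    idx = dist.topk(k * d, largest=False).indices   # k*d nearest (incl. self)
    return idx[:, ::d]                              # keep every d-th neighbor

x = torch.randn(100, 32)               # 100 nodes with 32-d features
neighbors = dilated_knn(x, k=9, d=2)   # (100, 9) dilated neighborhood
```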


