[ 15. Deep Generative Models for Graphs ]

( Reference : CS224W: Machine Learning with Graphs )


Contents

  1. ML for Graph Generation
  2. GraphRNN : Generating Realistic Graphs
  3. BFS (Breadth-First Search node ordering)
  4. Evaluating Generated Graphs
  5. Application of Deep Graph Generative Models


Deep graph encoders & DECODERS

( so far : deep graph ENCODERS, i.e. GNNs / this lecture : graph DECODERS, i.e. graph generation )


1. ML for Graph Generation

(1) Tasks

[ Task 1 ] Realistic graph generation

  • generate graphs that are “similar to a given set of graphs”

[ Task 2 ] Goal-directed graph generation

  • generate graphs that optimize a specific objective ( with constraints ), ex : drug molecules


(2) Graph Generative Models

Given : graphs \(\{x_i\}\) ( sampled from \(p_{data}(G)\) )

  • we do not have the true distribution, only some samples from it

Goal

  • goal 1) learn \(p_{model}(G)\)
  • goal 2) sample from \(p_{model}(G)\)



(3) Basics for generative models

Notation

  • \(p_{data}(x)\) : data distribution
    • unknown, but have some samples of it ( \(x_i \sim p_{data}(x)\) )
  • \(p_{model}(x ; \theta)\) : model
    • use this to approximate \(p_{data}(x)\)


Goal

  1. Density estimation
    • make \(p_{model}(x ; \theta)\) close to \(p_{data}(x)\)
  2. Sampling
    • \(x_{i} \sim p_{model}(x ; \theta)\)


a) Density estimation

key point : MLE

  • \(\boldsymbol{\theta}^{*}=\underset{\boldsymbol{\theta}}{\arg \max } \mathbb{E}_{x \sim p_{\text {data }}} \log p_{\text {model }}(\boldsymbol{x} \mid \boldsymbol{\theta})\).
  • finding the model that is most likely to have generated the observed data \(x\)
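
( below : a minimal sketch of MLE via gradient descent on a toy Bernoulli model ; the data & model are hypothetical, just to make the objective concrete )

```python
import torch

# toy binary samples x_i ~ p_data ( hypothetical )
x = torch.tensor([1., 1., 0., 1., 0., 1., 1., 1.])

# p_model(x ; theta) = Bernoulli( sigmoid(theta) )
theta = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([theta], lr=0.5)

for _ in range(300):
    p = torch.sigmoid(theta)
    # NLL = -E_{x ~ p_data} [ log p_model(x | theta) ]
    nll = -(x * torch.log(p) + (1 - x) * torch.log(1 - p)).mean()
    opt.zero_grad()
    nll.backward()
    opt.step()

print(torch.sigmoid(theta).item())  # -> 0.75 ( = empirical mean of x )
```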


b) Sampling

goal : sample from a “complex distribution”

most common approach

  • step 1) sample from noise : \(\boldsymbol{z}_{i} \sim N(0,1)\)

  • step 2) transform the noise : \(x_{i}=f\left(z_{i} ; \theta\right)\)

    \(\rightarrow\) key point : use \(f(\cdot)\) with DNN


(4) Deep Generative Models

Auto-regressive models

use “chain rule”

  • \(p_{\text {model }}(\boldsymbol{x} ; \theta)=\prod_{t=1}^{n} p_{\text {model }}\left(x_{t} \mid x_{1}, \ldots, x_{t-1} ; \theta\right)\).

  • (for graph generation) \(x_t\) : \(t\)-th action

    • ex) adding NODE, adding EDGE
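
( below : a sketch of the chain-rule factorization with an RNN over a binary action sequence ; the GRU & linear head are illustrative choices, not a specific model from the lecture )

```python
import torch
import torch.nn as nn

# p_model(x ; theta) = prod_t p_model(x_t | x_1, ..., x_{t-1} ; theta)
rnn = nn.GRU(input_size=1, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)

x = torch.tensor([1., 0., 1., 1.])            # one action sequence
prefix = torch.cat([torch.zeros(1), x[:-1]])  # input_t = x_{t-1} (shifted)
h, _ = rnn(prefix.view(1, -1, 1))
p = torch.sigmoid(head(h)).view(-1)           # p_t = p(x_t = 1 | x_<t)

log_prob = (x * torch.log(p) + (1 - x) * torch.log(1 - p)).sum()
```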


2. GraphRNN : Generating Realistic Graphs

(1) Basic Idea

generate graphs by sequentially adding nodes & edges



(2) Model graphs as “SEQUENCES”

Notation

  • \(G\) : graph
  • \(\pi\) : node ordering
  • \(S^{\pi}\) : Sequence



Sequence \(S^{\pi}\) consists of “2 levels”

  • 1) node-level : add one node
  • 2) edge-level : add edge between existing nodes


each “node-level” step is an “edge-level sequence”

  • node-level (a)
    • edge-level (a-1)
    • edge-level (a-2)
    • ..
  • node-level (b)
    • edge-level (b-1)
    • edge-level (b-2)
    • ..


Ex) node-level : ( figure )


Ex) edge-level : ( figure )


Summary : graph + node ordering = sequence of sequence

  • \(\therefore\) graph generation = sequence generation
  • ( node ordering : select randomly )

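( below : a sketch of turning a graph + node ordering \(\pi\) into the “sequence of sequences” \(S^{\pi}\) ; the 4-cycle example is hypothetical )

```python
import numpy as np

def graph_to_sequence(A, pi):
    """S^pi : for each new node t, the edge vector to nodes added before it."""
    A = A[np.ix_(pi, pi)]                # relabel nodes by the ordering pi
    return [A[t, :t].tolist() for t in range(1, len(A))]

# hypothetical example : 4-cycle 0-1-2-3-0
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
print(graph_to_sequence(A, pi=[0, 1, 2, 3]))
# [[1], [0, 1], [1, 0, 1]]  ( = sequence of sequences )
```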


Process : need to model 2 processes

  • 1) node-level sequence

    • generate a “state for a new node”
  • 2) edge-level sequence

    • generate edges for “new node”, based on its state

    \(\rightarrow\) use RNN for these 2 processes


(3) Graph RNN : 2-level RNN

a) 2-level RNN

  1. NODE-level RNN ( nRNN )
  2. EDGE-level RNN ( eRNN )


Relation between 2 RNNs :

  • “nRNN” generates the initial state for “eRNN”
  • “eRNN” sequentially predicts whether the new node will connect to each of the previous nodes



b) key questions

Q1) how does RNN generate sequences?

\(\rightarrow\) let \(x_{t+1} = y_t\) ( input = previous output )


Q2) how to initialize the input sequence?

\(\rightarrow\) use an SOS token ( e.g., a vector of zeros or ones )


Q3) when to stop generation?

\(\rightarrow\) use an EOS token as an extra RNN output


c) deterministic & stochastic

Deterministic output

  • feed the output directly as the next input : \(x_{t+1}=y_{t}\)


Stochastic output

  • sample the next input from the output distribution : \(x_{t+1} \sim y_{t}\)


Autoregressive Model ( with RNN )

  • \(p_{\text {model }}(\boldsymbol{x} ; \theta)=\prod_{t=1}^{n} p_{\text {model }}\left(x_{t} \mid x_{1}, \ldots, x_{t-1} ; \theta\right)\).
  • notation : \(y_{t}=p_{\text {model }}\left(x_{t} \mid x_{1}, \ldots, x_{t-1} ; \theta\right)\)
  • sampling : \(x_{t+1} \sim y_{t}\)
  • how to make it stochastic?
    • output of RNN = “probability of an edge”
    • (1) sample from that distribution & (2) feed to next step


d) RNN at Test Time

Notation

  • \(y_t\) : output of RNN at time \(t\)
    • = probability of an edge, \(p\)
  • next input : \(x_{t+1} \sim \operatorname{Bernoulli}(y_t)\)
    • ( value 1 with prob \(p\) ) & ( value 0 with prob \(1-p\) )


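( below : a sketch of one test-time step, assuming the RNN just produced an edge probability \(y_t\) )

```python
import torch

y_t = torch.tensor([0.8])       # RNN output : probability of an edge
x_next = torch.bernoulli(y_t)   # 1 with prob 0.8, 0 with prob 0.2
# x_next ( the sampled 0/1 edge ) is fed back as the next RNN input
```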


e) RNN at Training Time

observed data (given)

  • sequence \(y^{*}\) of edges ( 1 or 0 )

use “Teacher forcing”

  • use the “TRUE value” \(y^{*}\) as the input while training ( instead of the model’s own sample )
  • loss function ( binary cross-entropy, summed over steps ) : \(L=-\left[y_{t}^{*} \log \left(y_{t}\right)+\left(1-y_{t}^{*}\right) \log \left(1-y_{t}\right)\right]\)


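( below : a sketch of this teacher-forcing loss ; the numbers are made up )

```python
import torch
import torch.nn.functional as F

y_true = torch.tensor([1., 0., 1.])      # observed edge sequence y*
y_pred = torch.tensor([0.9, 0.2, 0.6])   # RNN outputs y ( edge probs )

# L = -[ y* log(y) + (1 - y*) log(1 - y) ] , averaged over steps
loss = F.binary_cross_entropy(y_pred, y_true)
```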


f) Summary

  • step 1) add a new node

    • run Node RNN

      ( output of Node RNN = initial state of Edge RNN )

  • step 2) add new edges ( for new node )

    • run Edge RNN

      ( predict whether the “new node” connects to each “previous node” )

  • step 3) add a new node

    • run Node RNN

      ( use last hidden state of Edge RNN )

  • step 4) stop generation

    • stop if the Edge RNN outputs EOS at its first step


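( below : a minimal end-to-end sketch of this 2-level loop, with untrained GRU cells, simplified dimensions & a crude EOS rule ; it shows the control flow, not the exact GraphRNN architecture )

```python
import torch
import torch.nn as nn

max_nodes = 6
M = max_nodes - 1                     # length of padded edge vectors

node_rnn = nn.GRUCell(M, 16)          # nRNN : reads previous edge vector
edge_rnn = nn.GRUCell(1, 16)          # eRNN : predicts one edge at a time
edge_out = nn.Linear(16, 1)

adj_rows = []
h_node = torch.zeros(1, 16)
x_node = torch.ones(1, M)             # SOS token for the node RNN

for t in range(1, max_nodes):
    h_node = node_rnn(x_node, h_node)     # step 1) add a new node
    h_edge = h_node                       # nRNN state initializes the eRNN
    x_edge, row = torch.ones(1, 1), []
    for s in range(t):                    # step 2) edges to nodes 0 .. t-1 ?
        h_edge = edge_rnn(x_edge, h_edge)
        p = torch.sigmoid(edge_out(h_edge))
        e = torch.bernoulli(p)            # stochastic edge decision
        row.append(int(e.item()))
        x_edge = e                        # feed the sample to the next step
    adj_rows.append(row)
    if t > 1 and sum(row) == 0:           # step 4) crude EOS : isolated node
        break
    # step 3) next nRNN input = the edge vector just generated ( padded )
    x_node = torch.tensor([row + [0] * (M - len(row))], dtype=torch.float)
```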


Test time

( just replace the “input” at each step with “GraphRNN’s own predictions” )



3. BFS (Breadth-First Search node ordering)

How do we make generation tractable?


too many steps are needed for edge generation

  • problem 1) need “FULL” adjacency matrix
  • problem 2) complex “TOO-LONG” edge dependencies

\(\rightarrow\) Solution : BFS (Breadth-First Search node ordering)


BFS (Breadth-First Search node ordering)



if node \(T\) doesn’t connect to node \(K\) ( \(K<T\) ),

then no later node \(T+\alpha\) connects to node \(K\) either

( all of node \(K\)’s neighbors have already been traversed by step \(T\)! )

\(\rightarrow\) only need memory of “2 steps” ( not \(n-1\) steps )
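
( below : a sketch showing that under BFS ordering every edge stays “short-range” ; networkx & the ladder graph are a hypothetical example )

```python
import networkx as nx

G = nx.ladder_graph(10)   # small connected example graph ( 20 nodes )

# BFS node ordering, starting from node 0
order = [0] + [v for _, v in nx.bfs_edges(G, source=0)]
rank = {v: i for i, v in enumerate(order)}

# under BFS ordering, every edge connects nodes that are close in the
# ordering, so the edge RNN needs only a short memory window
bandwidth = max(abs(rank[u] - rank[v]) for u, v in G.edges())
print(bandwidth)  # much smaller than n - 1 = 19
```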


With & Without BFS : ( figure )


4. Evaluating Generated Graphs


compare the 2 sets of graphs ( generated vs. real )

\(\rightarrow\) how similar are they? we need to define a “SIMILARITY METRIC”


Solution

  • 1) visual similarity
  • 2) graph statistics similarity


(1) Visual similarity

  • compare generated & real graphs by eye


(2) Graph statistics similarity

Typical graph statistics

  • 1) degree distribution
  • 2) clustering coefficient distribution
  • 3) orbit count statistics

\(\rightarrow\) each statistic is not a scalar, but a “PROBABILITY DISTN”
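
( below : a sketch of computing one such statistic, the degree distribution, as a probability distn ; networkx & the BA graph are illustrative choices )

```python
import numpy as np
import networkx as nx

def degree_distribution(G):
    """Degree histogram, normalized to a probability distribution."""
    hist = np.bincount([d for _, d in G.degree()])
    return hist / hist.sum()

p = degree_distribution(nx.barabasi_albert_graph(100, 2, seed=0))
```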


Solution

  • step 1) compare “2 graph statistics”
    • by EMD (Earth Mover Distance)
  • step 2) compare “sets of graph statistics”
    • by MMD (Maximum Mean Discrepancy)


a) EMD

  • similarity between “2 distns”
  • the “minimum effort” needed to move earth from one pile to another

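( below : for 1-D histograms, EMD coincides with the Wasserstein distance in scipy ; the two distns are made up )

```python
from scipy.stats import wasserstein_distance

# two degree distributions, as histograms over degrees 0..4
u = [0.1, 0.4, 0.3, 0.2, 0.0]
v = [0.0, 0.2, 0.3, 0.4, 0.1]

emd = wasserstein_distance(range(5), range(5), u_weights=u, v_weights=v)
```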


b) MMD

  • similarity between “2 sets”,

    based on the similarity between “set elements”



we will compare “2 sets of graph statistics”

( = “2 sets of distributions” )

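( below : a sketch of (squared) MMD between 2 sets of distributions, using a Gaussian kernel on EMD as the element-level similarity ; the kernel choice & \(\sigma\) are assumptions )

```python
import numpy as np
from scipy.stats import wasserstein_distance

def emd(p, q):
    return wasserstein_distance(range(len(p)), range(len(q)), p, q)

def kernel(p, q, sigma=1.0):
    # Gaussian kernel on the EMD between two statistics ( assumption )
    return np.exp(-emd(p, q) ** 2 / (2 * sigma ** 2))

def mmd2(X, Y):
    """Biased estimate of squared MMD between two sets X, Y."""
    k_xx = np.mean([kernel(a, b) for a in X for b in X])
    k_yy = np.mean([kernel(a, b) for a in Y for b in Y])
    k_xy = np.mean([kernel(a, b) for a in X for b in Y])
    return k_xx + k_yy - 2 * k_xy

X = [[0.5, 0.5, 0.0], [0.6, 0.4, 0.0]]   # statistics of real graphs
Y = [[0.1, 0.3, 0.6], [0.2, 0.2, 0.6]]   # statistics of generated graphs
print(mmd2(X, Y))
```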


Summary

  • step 1) compute “graph statistics” of all graphs
  • step 2) calculate “MMD” ( distance between sets )



5. Application of Deep Graph Generative Models

pass
