[ 1. Introduction : ML for Graphs ]

( 참고 : CS224W: Machine Learning with Graphs )


Contents

  • 1-1. Why Graphs

  • 1-2. Applications of Graph ML

  • 1-3. Choice of Graph Representation


1-1. Why Graphs

1) Graphs

= general language for describing & analyzing entities with…

  • 1) “relations”
  • 2) “interactions”


Lots of data are “graphs”!

  • ex) computer network, event graphs,…

figure2


2) Types of networks & graphs

Networks ( = Natural Graphs )

  • 1) Social networks ( society )
  • 2) Communication & Transactions
  • 3) Biomedicine ( ex. genes )
  • 4) Brain connections ( ex. thoughts )


Graphs ( = representation )

  • 1) information/knowledge
  • 2) similarity networks
  • 3) relational structures


Relational Graphs

  • model better relationships between entities!

  • use DL for complex modeling


3) Networks are complex

Why is it complex?

  • arbitrary size & complex structure
  • no fixed node ordering or reference point
  • dynamic & multi-modal features


4) DL in graphs

[ Input ]

  • Network

[ Output ] (Prediction)

  • 1) node labels
  • 2) new links
  • 3) generated graphs / subgraphs


Key point : “Representation Learning”

  • “automatically” learn features with DNN

  • map to \(d\)-dim embeddings

    • similar nodes = similar in embedded space


figure2


5) Outline

  • 1) Traditional methods
    • Graphlets, Graph Kernels
  • 2) Node Embeddings
    • DeepWalk, Node2vec
  • 3) GNN
    • GCN, GraphSAGE, GAT
  • 4) Knowledge graph & reasoning
    • TransE, BetaE
  • 5) Deep Generative models ( for graphs )

  • 6) Applications


1-2. Applications of Graph ML

1) Diverse Tasks

  • 1) Node classification
    • ex) categorizing users/items
  • 2) Link prediction
    • ex) knowledge graph completion
  • 3) Graph Classification
    • ex) Molecule property prediction
  • 4) Clustering
    • ex) community detection
  • 5) etc
    • Graph generation, Graph evolution ..


Diverse level of tasks

figure2


2) Example of “NODE-level” ML

ex) Protein Folding

  • Protein
    • protein= sequence of amino acid
    • 3d structure
    • interact with each other
  • Goal : “predict 3D structure”, based on “amino acid sequence”
  • key idea of AlphaFold : “spatial graph”
    • (1) node : amino acids
    • (2) edges : proximity between nodes


figure2


3) Example of “EDGE-level” ML

ex) Recommender Systems

  • Formulation
    • (1) node : user & items
    • (2) edge : user & item interaction
  • Goal : “Recommend item to users”

    ( predict whether 2 nodes are related )


figure2


ex) Drug Side Effects

  • Background : many patients & many drugs
  • Goal : predict adverse side effects of “pair of drugs”

  • Formulation
    • (1) node : drugs & proteins
    • (2) edges : interactions
      • drug-protein interaction
      • protein-protein interaction
      • drug-drug interaction


figure2


4) Example of “SUBGRAPH-level” ML

ex) Traffic Prediction

  • want to move from A to B…. how long will it take?
  • Formulation
    • (1) node : road segments
    • (2) edges : connectivity between nodes
  • ex) Google Maps : traffic prediction with GNN


figure2


5) Examples of “GRAPH-level” ML

ex) Drug Discovery

  • Antibiotics = small molecular graphs
  • Formulation
    • (1) node : atoms
    • (2) edges : chemical bonds
  • (Q) Which molecules should be prioritized?
  • ex) graph classification model
    • predict promising molecules among candidates


ex) Graph generation

  • generate novel molecules ( new structure )
    • with “high drug likeness”
    • with “desirable properties”


figure2


ex) Physics Simulation

  • Formulation
    • (1) node : particles
    • (2) edges : interaction between nodes
  • Goal : predict how a graph will EVOLVE


1-3. Choice of Graph Representation

1) Components of Network

figure2

  • (1) Objects : nodes, vertices ( notation : \(N\) )

  • (2) Interactions : links, edges ( notation : \(E\) )

  • (3) System : network, graph ( notation : \(G(N,E)\) )


Notation :

  • \(\mid N \mid\) : number of nodes
  • \(\mid E \mid\) : number of edges


[ Question ]

Which one is proper representation??

How to build a graph?

  • 1) what are nodes?
  • 2) what are edges?


[ Answer ]

  • differ by domain/tasks!


2) Directed & Undirected Graphs

Directed

  • links = arcs
  • ex) phone calls, following in INSTAGRAM

Undirected

  • links = symmetrical, reciprocal
  • ex) collaborations, friendship in FB


figure2


3) Node Degree

Node degree ( \(k_i\) )

  • # of edges, adjacent to node \(i\)

Average degree

  • \(\bar{k}=\langle k\rangle=\frac{1}{N} \sum_{i=1}^{N} k_{i}=\frac{2 E}{N}\).


Directed Network

In-degree & Out-degree


4) Bipartite Graph

graph, whose nodes can be divided into “2 disjoint sets \(U\), \(V\)”, where …

every link connects a node in \(U\) to one in \(V\)

( = \(U\) & \(V\) : independent sets )

ex) User-Movie rating


figure2


Folded / Projected Bipartite Graphs

figure2


5) Adjacency Matrix

Different ways of representing graphs :

  • 1) adjacency matrix

  • 2) edge list & adjacency list


Adjacency Matrix

\(n \times n\) matrix ( \(n\) : number of nodes )

  • connected = 1
  • disconnected = 0

( Undirected Graph : symmetric adjacency matrix )

Problem : Adjacency matrices are SPARSE!


6) Edge list & Adjacency list

Different ways of representing graphs :

  • 1) adjacency matrix

  • 2) edge list & adjacency list


(a) Edge list

  • ex) (2,3), (2,4), (3,2), (3,4) ,… (5,2)


(b) Adjacency list

  • easier when network is large & sparse
  • ex)
    • 1 :
    • 2 : 3,4
    • 3 : 2,4
    • 4 : 5
    • 5 : 1,2


7) Node & Edge ATTRIBUTES

ex) weight, ranking, type, sign, …


8) More Types of Graphs

unweighted & weighted

  • unweighted ( 1 or 0 )
  • weighted ( not binary )

figure2


self-edges & multi-graph

  • self-edges ( self-loops )
  • multi-graph
    • multiple edges between pair of nodes

figure2


connected & disconnected graph

figure2

figure2


strongly & weakly connected graph

  • strongly : both A-B & B-A path
  • weakly : disregard edge direction


Strongly connected components (SCCs)

figure2


Tags: ,

Categories:

Updated: