
Graph Neural Networks
Aakash Kumar
Arvind Ramadurai
Introduction
- A graph is a data structure consisting of a collection of vertices and the
connections between them, called edges (see the sketch after this list).
- It models the relationships among a set of nodes.
- Graph neural networks (GNNs) operate on this graph domain.
- They are used for various tasks such as node classification, link
prediction, and clustering.
- Unlike standard neural networks, graph neural networks retain a state that
can represent information from a node's neighborhood at arbitrary depth.
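
To make the definition concrete, here is a minimal sketch of a small
undirected graph stored both as an adjacency list and as an adjacency matrix;
the 4-node graph is made up for illustration.

import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # hypothetical undirected edges
n = 4                                       # number of vertices

# Adjacency list: node -> set of neighbours
adj_list = {v: set() for v in range(n)}
for u, v in edges:
    adj_list[u].add(v)
    adj_list[v].add(u)

# Adjacency matrix: A[u, v] = 1 if (u, v) is an edge
A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1

print(adj_list)  # {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}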
Why GNNs?
- The main motivation for GNNs came from convolutional neural
networks.
- The three main aspects of a CNN are local connection, shared weights
and the use of multiple layers.
- These translate well to graphs:
- Graphs are a perfect example of locally connected structure.
- Shared weights can be used to reduce computational cost.
- The use of multiple layers is the key to model hierarchical problems.

Therefore, it is straightforward to seek a generalization of CNNs to graphs.
What is a node embedding?
Node Embedding
- A node embedding simply means calculating a vector for
each node in the graph.
- The vector is calculated to have useful properties.
- For example, the dot product of any two nodes' embeddings could tell you
whether they belong to the same class (see the sketch after this list).
- In this way, the embedding simplifies the graph data down to something much
more manageable: a vector.
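
A minimal sketch of the dot-product intuition above: nodes of the same class
get similar embedding vectors, so their dot product is large. The embeddings
here are made up for illustration, not learned.

import numpy as np

embeddings = {
    "a": np.array([0.9, 0.1]),   # class 1 (hypothetical)
    "b": np.array([0.8, 0.2]),   # class 1
    "c": np.array([0.1, 0.9]),   # class 2
}

print(embeddings["a"] @ embeddings["b"])  # 0.74 -> likely the same class
print(embeddings["a"] @ embeddings["c"])  # 0.18 -> likely different classes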
How does a GNN work?
Working of a GNN - Part 1
- The target of the GNN is to learn a node embedding h_v which contains the
information of the neighbourhood of node v.
- h_v is an s-dimensional vector that can be used to produce an output o_v,
such as the node label.
- Let f be a local transition function and g be a local output function.
Working of a GNN - Part 2
x_v - features of node v
x_co[v] - features of the edges incident to node v
h_ne[v] - states of the neighbouring nodes of v
x_ne[v] - features of the neighbouring nodes of v

These give the local equations h_v = f(x_v, x_co[v], h_ne[v], x_ne[v]) and
o_v = g(h_v, x_v).

Let H, O, X, X_N be the stacking of all the states, all the outputs, all the
features and all the node features, respectively.

Let F and G, the global transition function and global output function, be the
stacked versions of f and g.

Then we have the compact form: H = F(H, X) and O = G(H, X_N).
Let us first look at the Fixed Point Iteration Method.
Fixed Point Iteration Method
A point s is a fixed point of g if it satisfies the equation s = g(s).

Algorithm:

1. Given an equation f(x) = 0, rearrange it into the form x = g(x).
2. Let the initial guess be x_0.
3. Repeat x_{i+1} = g(x_i) until one of the convergence criteria C1 or C2 is
met (see the sketch below).
C1: a fixed number of iterations N is reached
C2: the condition |x_{i+1} - x_i| < ε holds
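
A minimal sketch of the method, using the hypothetical example
f(x) = x^2 - 2 = 0 rearranged as x = g(x) = (x + 2/x) / 2, whose fixed point
is sqrt(2); the tolerances are illustrative choices.

def fixed_point(g, x0, max_iter=100, eps=1e-10):
    """Iterate x_{i+1} = g(x_i) until C1 (max_iter) or C2 is met."""
    x = x0
    for _ in range(max_iter):          # C1: cap on the number of iterations
        x_next = g(x)
        if abs(x_next - x) < eps:      # C2: successive iterates are close
            return x_next
        x = x_next
    return x

print(fixed_point(lambda x: (x + 2 / x) / 2, x0=1.0))  # ~1.41421356 = sqrt(2)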
Working of GNN - Part 3
- Provided that F is a contraction map, the equation H = F(H, X) has a unique
fixed point by the Banach Fixed Point Theorem.
- So we can use the classic iterative scheme H^{t+1} = F(H^t, X) to compute H
(a sketch follows).
Working of GNN - Part 4
- The GNN can be trained as a semi-supervised learning algorithm: only a few
of the nodes have a ground-truth label t_v.
- With this target information for supervision, the loss function can be
written as:

loss = sum_{i=1}^{p} (t_i - o_i)^2

- Here p is the number of supervised nodes, and a gradient-descent strategy
is used to minimize the loss function (a sketch follows).
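
A minimal sketch of this semi-supervised loss, summed only over the p
supervised nodes; the outputs, targets and mask are hypothetical.

import numpy as np

o = np.array([0.9, 0.2, 0.7, 0.1])                 # model outputs o_v
t = np.array([1.0, 0.0, 1.0, 0.0])                 # ground-truth targets t_v
supervised = np.array([True, False, True, False])  # only p = 2 nodes labelled

# Squared-error loss over the supervised nodes only.
loss = np.sum((t[supervised] - o[supervised]) ** 2)
print(loss)  # (1.0 - 0.9)^2 + (1.0 - 0.7)^2 = 0.1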
Graph Convolutional Neural Network
Graph Convolutional Network
1. The Graph Convolutional Network (GCN) is easy to follow, as its
architecture is relatively similar to that of a standard neural network.
2. It takes the adjacency matrix and the features of each node as input:
a. adjacency matrix: N x N
b. feature matrix: N x F
Propagation Rule

The simplest layer-wise propagation rule is H^{l+1} = σ(A H^l W^l), where A is
the adjacency matrix, H^l are the activations of layer l (with H^0 = X), W^l is
the layer's weight matrix and σ is a non-linearity such as ReLU.
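
A minimal numpy sketch of one layer of this rule; the graph, sizes and
weights are made up for illustration.

import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # adjacency matrix, N = 3
X = rng.normal(size=(3, 4))              # feature matrix, F = 4
W0 = rng.normal(size=(4, 2))             # layer-0 weight matrix
relu = lambda z: np.maximum(z, 0)

# One GCN layer: aggregate neighbour features, project, apply non-linearity.
H1 = relu(A @ X @ W0)
print(H1.shape)  # (3, 2)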
Output for Karate Club Dataset

Trained using random weight initialisation with two layers.
Limitations of GCN and Solutions
1. The aggregated representation of a node does not include its own features.
2. This can be solved by adding the identity matrix to the adjacency matrix
(A + I), which gives every node a self-loop.
3. Nodes with large degrees will have large values in their feature
representations, while nodes with small degrees will have small values. This
can cause vanishing and exploding gradients, which is problematic for
stochastic gradient descent during training.
4. The above problem can be solved by multiplying with the inverse degree
matrix, which divides each row by the degree of the corresponding node (see
the sketch after this list).
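
A minimal sketch of both fixes applied to the propagation rule, giving
H^{l+1} = σ(D^{-1} (A + I) H^l W^l); a symmetrically normalised variant
D^{-1/2} (A + I) D^{-1/2} is also common. The graph and weights are
hypothetical.

import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)     # hypothetical adjacency matrix
X = rng.normal(size=(3, 4))
W0 = rng.normal(size=(4, 2))
relu = lambda z: np.maximum(z, 0)

A_hat = A + np.eye(3)                      # fix 1: self-loops keep own features
D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # fix 2: inverse degree matrix
H1 = relu(D_inv @ A_hat @ X @ W0)          # normalised propagation rule
print(H1.shape)  # (3, 2)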
Training GCN with Karate Dataset
Zero-Shot Learning
1. Zero-shot learning deals with identifying a class even though the model
was never trained on data from that class.
2. It can be solved if we can somehow find similarities between seen and
unseen classes.
3. These similarities are stored in a knowledge graph.
4. The final classification decision for unseen data is based on the
knowledge graph and the visual information of the seen data.
(Diagram: seen data and unseen data connected by a knowledge graph encoding
the relationships between seen and unseen classes.)
Research Gaps
- Are GNNs better for clustering than traditional algorithms like k-means and
k-medoids?
- A majority of GNN architectures assume homogeneous graphs. It is difficult
to directly apply current GNNs to heterogeneous graphs, which may contain
different types of nodes and edges, or different forms of node and edge
inputs, such as images and text.
- The scalability of GNNs is gained at the price of corrupting graph
completeness. Whether using sampling or clustering, a model will lose part
of the graph information. By sampling, a node may miss its influential
neighbors.
Thank You.
