Graph basics

Overview

a network is a collection of vertices joined by edges
vertices and edges are also called nodes and links in computer science and actors and ties in sociology
in mathematics, a network is called graph and it is typically represented as a square matrix
given a graph \(G\) with \(n\) nodes numbered from \(1\) to \(n\), the adjacency matrix \(A = (a_{i,j})\) of \(G\) is a square \(n \times n\) matrix such that \(a_{i,j} = 1\) if there exists an edge joining nodes \(i\) and \(j\), and \(a_{i,j} = 0\) otherwise

Undirected graphs

a graph is undirected if edges have no direction: if there is an edge from \(i\) to \(j\) , then there is also an edge from \(j\) to \(i\)
this means that the adjacency matrix of an undirected graph is symmetric: \(a_{i,j} = a_{j,i}\) for every pair \(i,j\), or \(A = A^T\), where \(A^T\) is the transpose of \(A\)
if there is and edge between \(i\) and \(j\), then \(i\) and \(j\) are said to be neighbors
the neighbors of node \(i\) are the 1s of row (or column) \(i\) of \(A\)

Undirected graphs

library(igraph)
edges = c(1,2, 1,5, 2,3, 2,4, 3,4, 3,5, 3,6)
ug = graph(edges, directed=FALSE)
coords = layout_with_fr(ug)
plot(ug, layout=coords,
     vertex.size = 20, vertex.color = "lightblue", 
     edge.width = 3, edge.color = "black")

(A = as_adjacency_matrix(ug, sparse=FALSE))

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    1    0    0    1    0
## [2,]    1    0    1    1    0    0
## [3,]    0    1    0    1    1    1
## [4,]    0    1    1    0    0    0
## [5,]    1    0    1    0    0    0
## [6,]    0    0    1    0    0    0

Directed graphs

a graph is directed if edges have a direction: if there is an edge from \(i\) to \(j\), then there might be or not the inverse edge
this means that the adjacency matrix of an directed graph is not necessarily symmetric
if there is and edge from \(i\) to \(j\), then \(i\) is a predecessor of \(j\) and \(j\) is a successor of \(i\)
the predecessors of node \(i\) are the 1s on column \(i\) of \(A\) and the successors of node \(i\) are the 1s on row \(i\) of \(A\)
self-loops or self-edges are edges \((i,i)\) from a node \(i\) to itself: they correspond to diagonal entries in the adjacency matrix

Directed graphs

edges = c(1,3, 1,4, 2,5, 3,2, 4,4, 4,5, 5,3, 5,6, 5,4)
dg = graph(edges, directed=TRUE)
coords = layout_with_fr(dg)
plot(dg, layout=coords,
     vertex.size = 20, vertex.color = "lightblue", 
     edge.width = 3, edge.color = "black")

(A = as_adjacency_matrix(dg, sparse=FALSE))

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    0    1    1    0    0
## [2,]    0    0    0    0    1    0
## [3,]    0    1    0    0    0    0
## [4,]    0    0    0    1    1    0
## [5,]    0    0    1    1    0    1
## [6,]    0    0    0    0    0    0

Simple and multigraphs

In some cases there can be more than one edge between the same pair of vertices; we refer to those edges collectively as a multiedge
a network that has no multiedges is called a simple graph, otherwise a multigraph

Simple and multigraphs

edges = c(1,2, 1,5, 1,5, 1,5, 2,3, 2,3, 2,4, 3,4, 3,5, 3,6, 6,6)
mug = graph(edges, directed=FALSE)
coords = layout_with_fr(mug)
plot(mug, layout=coords,
     vertex.size = 20, vertex.color = "lightblue", 
     edge.width = 3, edge.color = "black")

(A = as_adjacency_matrix(mug, sparse=FALSE))

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    1    0    0    3    0
## [2,]    1    0    2    1    0    0
## [3,]    0    2    0    1    1    1
## [4,]    0    1    1    0    0    0
## [5,]    3    0    1    0    0    0
## [6,]    0    0    1    0    0    1

Unweighted and weighted graphs

In some situations it is useful to represent edges as having a strength, weight, or value to them:

in the Internet edges might have weights representing the amount of data flowing along them or their bandwidth
in a food web predator-prey interactions might have weights measuring total energy flow between prey and predator
in a social network connections might have weights representing the sign and intensity of the relationship: positive weights denote friendship and negative ones represent animosity

Unweighted and weighted graphs

edges = c(1,2, 1,5, 2,3, 2,4, 3,4, 3,5, 3,6)
wg = graph(edges, directed=FALSE)
E(wg)$weight = c(1, 5, 2, 8, 4, 5, 6)
coords = layout_with_fr(wg)
plot(wg, layout=coords,
     vertex.size = 20, vertex.color = "lightblue", 
     edge.width = 1, edge.color = "grey", edge.label = E(wg)$weight)

(A = as_adjacency_matrix(wg, sparse=FALSE, attr = "weight"))

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    1    0    0    5    0
## [2,]    1    0    2    8    0    0
## [3,]    0    2    0    4    5    6
## [4,]    0    8    4    0    0    0
## [5,]    5    0    5    0    0    0
## [6,]    0    0    6    0    0    0

Play

a path in a network is any sequence of nodes such that every consecutive pair of vertices in the sequence is connected by an edge in the network
a simple path is a path with no repetitions of nodes and edges allowed
the length of a path is the number of edges of the path
given a simple graph, define a matrix \(N^{(r)}\) such that \(N_{i,j}^{(r)}\) is the number of (not necessarily simple) paths of length \(r \geq 1\) between nodes \(i\) and \(j\) in terms of the graph adjacency matrix
what is \(N_{i,i}^{(r)}\)?

Solution

Notice that \[N_{i,j}^{(1)} = A_{i,j}\] is the number of paths of length 1 (edges) from \(i\) to \(j\).

The product \(A_{i,k} A_{k,j}\) is \(1\) if and only if there is a path of length 2 from \(i\) to \(j\) (that goes through \(k\)). Hence, the number of paths of length 2 from \(i\) to \(j\) is:

\[N_{i,j}^{(2)} = \sum_{k=1}^{n} A_{i,k} A_{k,j} = [A^{2}]_{i,j}\]

where \([A^{2}]_{i,j}\) is the \((i,j)\) entry of matrix \(A^2 = A A\).

More generally, the number of paths of length \(r\) from \(i\) to \(j\) is:

\[N_{i,j}^{(r)} = [A^{r}]_{i,j}\]

It follows that \(N_{i,i}^{(r)}\) is the number of cycles of length \(r\) that contain node \(i\).

Play

Given and adjacency matrix \(A\) for a graph, matrices \(AA^T\) and \(A^TA\) are called projection matrices. Give a meaning to these projections on a directed graph.

Solution

In directed networks, we have that:

\[[AA^T]_{i,j} = \sum_{k=1}^{n} A_{i,k} A_{j,k}\]

is the number of common successors of \(i\) and \(j\) with \([AA^T]_{i,i}\) is the number of successors of \(i\) and

\[[A^TA]_{i,j} = \sum_{k=1}^{n} A_{k,i} A_{k,j}\]

is the number of common predecessors of \(i\) and \(j\) with \([A^TA]_{i,i}\) is the number of predecessors of \(i\).

Bipartite graphs

a bipartite graph is a graph where there are two kinds of vertices, and edges run from nodes of different types only
any network in which the vertices are connected together by common membership of groups of some kind can be represented in this way
in sociology such networks are called affiliation networks: for instance, scientists coauthoring papers or film actors appearing together in films
a bipartite network is typically represented with an incidence matrix: if \(n\) is the number of actors in the network and \(g\) is the number of groups, then the incidence matrix \(B\) is a \(g \times n\) matrix having elements \(B_{i,j} = 1\) if group \(i\) contains participant \(j\) and \(B_{i,j} = 0\) otherwise

Bipartite graphs

types = c(rep(TRUE,7), rep(FALSE,4))
edges = c(8,1, 8,2, 8,3, 9,2, 9,3, 9,4, 9,5, 10,4, 10,6, 11,5, 11,6, 11,7)
bg = make_bipartite_graph(types, edges, directed=FALSE)
lay = layout.bipartite(bg)
plot(bg, layout=lay[,2:1], 
     vertex.color=c("skyblue","green")[V(bg)$type + 1], 
     vertex.size = 20)

(B = as_incidence_matrix(bg))

##    1 2 3 4 5 6 7
## 8  1 1 1 0 0 0 0
## 9  0 1 1 1 1 0 0
## 10 0 0 0 1 0 1 0
## 11 0 0 0 0 1 1 1

Play

A bipartite graph is a special case of undirected graph, hence it can be represented with a symmetric adjacency matrix as well.

Write the adjacency matrix \(A\) of a bipartite graph in terms of the incidence matrix \(B\) of the graph.

Solution

\[A = \begin{pmatrix} 0 & B^T \cr B & 0 \end{pmatrix}\] where \(0\) are square matrices of size \(n \times n\) (upper) and \(g \times g\) (lower).

# incidence matrix
(B = as_incidence_matrix(bg))

##    1 2 3 4 5 6 7
## 8  1 1 1 0 0 0 0
## 9  0 1 1 1 1 0 0
## 10 0 0 0 1 0 1 0
## 11 0 0 0 0 1 1 1

# adjacency matrix
(A = as_adjacency_matrix(bg, sparse = FALSE))

##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
##  [1,]    0    0    0    0    0    0    0    1    0     0     0
##  [2,]    0    0    0    0    0    0    0    1    1     0     0
##  [3,]    0    0    0    0    0    0    0    1    1     0     0
##  [4,]    0    0    0    0    0    0    0    0    1     1     0
##  [5,]    0    0    0    0    0    0    0    0    1     0     1
##  [6,]    0    0    0    0    0    0    0    0    0     1     1
##  [7,]    0    0    0    0    0    0    0    0    0     0     1
##  [8,]    1    1    1    0    0    0    0    0    0     0     0
##  [9,]    0    1    1    1    1    0    0    0    0     0     0
## [10,]    0    0    0    1    0    1    0    0    0     0     0
## [11,]    0    0    0    0    1    1    1    0    0     0     0

Play

A projection of a bipartite graph is a weighted undirected graph such that:

the nodes are the vertices of the bipartite graph of one type (hence, there are two projections)
there is an edge between two nodes of the same type if they have a neighbor in common in the bipartite graph
the weight of the edge is the number of common neighbors

Write the adjacency matrices of the two projections of a bipartite graph in terms of its incidence matrix \(B\) and code one example. What is the value on the diagonal of the projections?

Solution

If \(B\) is the incidence matrix of a bipartite graph with \(n\) actors and \(g\) groups, then the one-mode projection graph on actors is the \(n \times n\) symmetric matrix \[B^T B\] and the one-mode projection graph on groups is the \(g \times g\) symmetric matrix \[B B^T\]

# incidence matrix
(B = as_incidence_matrix(bg))

##    1 2 3 4 5 6 7
## 8  1 1 1 0 0 0 0
## 9  0 1 1 1 1 0 0
## 10 0 0 0 1 0 1 0
## 11 0 0 0 0 1 1 1

# first projection
t(B) %*% B

##   1 2 3 4 5 6 7
## 1 1 1 1 0 0 0 0
## 2 1 2 2 1 1 0 0
## 3 1 2 2 1 1 0 0
## 4 0 1 1 2 1 1 0
## 5 0 1 1 1 2 1 1
## 6 0 0 0 1 1 2 1
## 7 0 0 0 0 1 1 1

# second projection
B %*% t(B)

##    8 9 10 11
## 8  3 2  0  0
## 9  2 4  1  1
## 10 0 1  2  1
## 11 0 1  1  3

Regular graphs

A regular graph is defined as follows:

An unweighted undirected graph is regular if there is an integer \(r > 0\) such that all nodes have degree equal to \(r\).
An unweighted directed graph is regular if there is an integer \(r > 0\) such that all nodes have out-degree and in-degree equal to \(r\).

Weighted degree

in a weighted undirected graph, the weighted degree of a node is the sum of weights of edges incident in the node
in a weighted directed graph, we distinguish between weighted out-degree (the sum of weights of edges leaving the node) and weighted in-degree (the sum of weights of edges entering the node)

Regularizable graphs

A regularizable graph is defined as follows:

An unweighted undirected graph is regularizable if the edges of the graph can be weighted with positive integers and in the resulting weighted graph all nodes have weighted degree equal to some \(r > 0\).
An unweighted directed graph is regularizable if the edges of the graph can be weighted with positive integers and in the resulting weighted graph all nodes have weighted out-degree and weighted in-degree equal to some \(r > 0\).

Regularizable graphs

a regular graph is regularizable but there are regularizable graphs that are not regular
not all graphs are regularizable: it is easy to see that a star graph with at least 3 nodes is not regularizable

library(igraph)
plot(make_star(3, mode = "undirected"))

Play

Consider the following two undirected graphs. The left graph is not regularizable: can you tell why? The graph on the right is regularizable: can you find weights for the edges to prove it?

Solution

on the left, suppose you put a weight of \(\alpha > 0\) on edge \((1,2)\) and a weight of \(\beta > 0\) on edge \((1,3)\). Then node 2 has degree \(\alpha\) and node 1 has degree \(\alpha + \beta > \alpha\).
the graph on the right is regularizable: if we label the outer edges with \(3\) and the inner edges with \(2\) we have a positive regularizability solution with regularization degree \(8\).

Play

Consider the following two directed graphs. Find regularizable solution for graph on the right. Is graph on the left regularizable?

Solution

a regularizable solution for the right graph labels all edges with 1 and edge from 4 to 1 with 2
there is no regularizable solution for the graph on the left since nodes 2 and 4 (or 1) will necessarily have different degrees.

Overview

Undirected graphs

Undirected graphs

Directed graphs

Directed graphs

Simple and multigraphs

Simple and multigraphs

Unweighted and weighted graphs

Unweighted and weighted graphs

Play

Solution

Play

Solution

Bipartite graphs

Bipartite graphs

Play

Solution

Play

Solution

Regular graphs

Weighted degree

Regularizable graphs

Regularizable graphs

Play

Solution

Play

Solution

Dig deeper