Small-world networks

Geodesic paths and distances

a shortest path, or geodesic path, between two nodes in a graph is a path with the minimum number of edges
the length of a geodesic path is called geodesic distance or shortest distance.

Small worlds

one large-scale property of networks is the average geodesic distance between pairs of nodes in the network
one of the most discussed network phenomena is the small-world effect: in most real network the typical geodesic distance is surprisingly short, in particular when compared with the number of nodes of the network
check the small-world effect using the Erdős number on MathSciNet collaboration distance tool

Milgram’s experiment

The first experimental evidence of the small-world phenomenon is made by psychologist Stanley Milgram in the 1960s with his now-famous small-world experiment
Milgram was interested to quantifying the typical distance between actors in a social network
He sent a set of packages, 96 in total, to recipients randomly chosen from the telephone directory in the US town of Omaha, Nebraska. The packages contained written instructions asking the recipients to attempt to get an included passport to a specified target individual, a friend of Milgram’s who lived in Boston, Massachusetts, over a thousand miles away
The only information supplied about the target was his name, geographic location, and occupation
But the passport holders were not allowed simply to send the passport to the given address. Instead, they were asked to pass it to someone they knew on a first-name basis and more specifically the person in this category who they felt would stand the best chance of getting the passport to the intended target
Each receiver was asked to repeat the process so that after a succession of such steps, a path on the social network of acquaintances, the passport would find its way to the target person
The length of such path provided an upper bound on the geodesic distance in the social network between the starting and ending individuals.
Of the 96 passports sent out, 18 found their way to the target in Boston
Milgram asked participants to record in the passport each step of the path taken. He found that the mean length for completed paths was 5.9 steps
This result is the origin of the idea of the six degrees of separation, the popular belief that there are only about six steps between any two people in the world
The phrase six degrees of separation is in fact more recent and comes from a popular Broadway play by John Guare, later made into a film, in which the lead character discusses Milgram’s work.

Milgram’s experiment: from local to global information

Having the complete network at disposal, the problem of finding the shortest path between two nodes is easy to resolve using a breath-first visit of the graph
Kleinberg noticed that, despite participants in Milgram’s experiment had only local information about their neighbourhood of acquaintances, they managed someway to find remarkably short paths to the target, which is a global property of the social graph
This happened because participants had a good knowledge about which of their friends was the closest to the target in terms of geodesic distance and they chose to send the passport to this closest friend

Small world effect

the small-world effect has a substantial implication for networked systems
suppose a rumor (or a disease) is spread over a social network
clearly it will reach people much faster if it is only about six steps from any person to any other than if it is a hundred
similarly, the speed with which one gets a response from another computer on the Internet depends on how many hops data packets have to make as they traverse the network

The small world size

in fact, once one looks deeply into the mathematics of networks, one discovers that the small-world effect is not so surprising after all
mathematical models of networks suggest that path lengths in networks should typically scale as \(\log n\) with the number \(n\) of network nodes, and should therefore tend to remain small even for large networks
the rough idea behind this short distances is that the number of nodes one encounters in a breath-first visit of a graph starting from a source node typically increases exponentially with the distance to the source node, hence the distance increases logarithmically in the number of visited nodes
it is observed that distances on a graph normally distribute; this means that the average distance represents a typical value of all distances in the network
furthermore, since the distribution of distances drops off rapidly around the mean, the time-consuming computation of the exact average distance can be approximated by computing the average distance on a relatively small random sample of node pairs
both random and preferential attachment network models generate small-world networks

Code

library(igraph)

# Zachary graph
g = make_graph("Zachary")

# number of nodes
vcount(g)

## [1] 34

# average distance (degree of separation)
mean_distance(g)

## [1] 2.4082

# maximum geodesic distance (diameter)
diameter(g)

## [1] 5

# highlight the diameter
d = get_diameter(g)
V(g)$color  = "white"
E(g)$color = "grey"
E(g)$width = 1
E(g, path=d)$color = "red"
E(g, path=d)$width = 2
V(g)[d]$color = "red"
plot(g, vertex.label = NA, vertex.size=5)

# histogram of distances
distance_table(g)

## $res
## [1]  78 265 137  73   8
## 
## $unconnected
## [1] 0

paths = distance_table(g)$res
names(paths) = 1:length(paths)
barplot(paths / sum(paths), xlab="Distance", ylab="Frequency")