Signed networks

  • in our discussion of networks thus far, we have generally viewed the relationships contained in these networks as having positive connotations
  • the terminology of on-line social networks reflects a largely similar view, through its emphasis on the connections one forms with friends, fans, followers, and so forth
  • but in most network settings, there are also negative effects at work. Some relations are friendly, but others are antagonistic or hostile
  • how should we reason about the mix of positive and negative relationships that take place within a network?
  • this part is taken from Chapter 12 (Positive and Negative Relationships) of book Networks, crowds and markets

Structural balance

  • suppose we have a undirected social network on a set of people, in which everyone knows everyone else, no two people are indifferent to one another, or unaware of each other: a clique, or a complete graph
  • we then label each edge with either + or −; a + label indicates that its two endpoints are friends, while a − label indicates that its two endpoints are enemies
  • the model we’re considering makes the most sense for a group of people small enough to have this level of mutual awareness, or for a setting such as international relations, in which the nodes are countries and every country has an official diplomatic position toward every other

Structural balance

  1. A friend of a friend will be a friend
  2. A friend of an enemy will be an enemy
  3. An enemy of a friend will be an enemy
  4. An enemy of an enemy will be a friend

Structural balance

  • the principles underlying structural balance are based on theories in social psychology dating back to the work of Heider in the 1940s, and generalized and extended to the language of graphs beginning with the work of Cartwright and Harary in the 1950s
  • when we look at sets of three people at a time, certain configurations of +’s and −’s are socially and psychologically more plausible than others
  • balanced (or positive) triangles have an even number of negative signs (0 or 2), or the multiplication of the edge signs is positive
  • unbalanced (or negative) triangles have an odd number of negative signs (1 or 3), or the multiplication of the edge signs is negative
  • a network is balanced if all triangles in it are balanced
  • the argument of structural balance theorists is that because unbalanced triangles are sources of stress or psychological dissonance, people strive to minimize them in their personal relationships, and hence they will be less abundant in real social settings

Characterizing the structure of balanced networks

We say that a signed complete network has the mutual antagonism property if:

all pairs of nodes are friends, or else the nodes can be divided into two groups, X and Y, such that every pair of nodes in X like each other, every pair of nodes in Y like each other, and everyone in X is the enemy of everyone in Y.

Balance Theorem: A signed complete graph is balanced if and only if it has the mutual antagonism property [Frank Harary, 1953]

  • the Balance Theorem is not at all an obvious fact, nor should it be initially clear why it is true
  • essentially, we’re taking a purely local property, namely that all triangles are balanced, which applies to only three nodes at a time, and showing that it implies a strong global property: either everyone gets along, or the world is divided into two battling factions

Proving the Balance Theorem

If a network has the mutual antagonism property then it is balanced

You can check that such a network is balanced: a triangle contained entirely in one group or the other has three + labels, and a triangle with two people in one group and one in the other has exactly one + label.

Proving the Balance Theorem

If a network is balanced then it has the mutual antagonism property

  • let’s pick any node in the network - we’ll call it A - and consider things from A’s perspective

  • every other node is either a friend of A or an enemy of A

  • we can assume A has both friends and enemies; indeed if all nodes of the network have only friends, the network has the desired property; if all nodes have only enemies, then the network is not balanced

  • thus, natural candidates to try for the sets X and Y would be to define X to be A and all its friends, and define Y to be all the enemies of A

  • this is indeed a division of all the nodes, since every node is either a friend or an enemy of A

Approximately balanced networks

What if we only know that most triangles are balanced?

It turns out that the conditions of the theorem can be relaxed in a very natural way, allowing us to prove the following statement:

Let \(\epsilon\) be any number such that \(0 \leq \epsilon < 1/8\), and define \(\delta = \sqrt[3]\epsilon\). If at least \(1 − \epsilon\) of all triangles in a labeled complete graph are balanced, then either

  1. there is a set consisting of at least \(1 - \delta\) of the nodes in which at least \(1 - \delta\) of all pairs are friends, or else
  2. the nodes can be divided into two groups, X and Y, such that
    • at least \(1 - \delta\) of the pairs in X like each other,
    • at least \(1 - \delta\) of the pairs in Y like each other, and
    • at least \(1 - \delta\) of the pairs with one end in X and the other end in Y are enemies.

For instance:

  • if \(\epsilon = 0\) then \(\delta = 0\) and we get the original balance theorem
  • if \(\epsilon = 10^{-n}\) then \(\delta = 10^{-n/3}\): for instance, if \(\delta = 0.1\) (90% of the mutual antagonism property) the request is \(\epsilon = 0.001\) (99.9% of the balance share)
  • it is possible to show that this relationship between \(\epsilon\) and \(\delta\) is in fact essentially the best one can do

Structural balance in arbitrary networks

  • let’s consider the case of a social network that is not necessarily complete — that is, there are only edges between certain pairs of nodes, but each of these edges is still labeled with + or −
  • so now there are three possible relations between each pair of nodes: a positive edge, indicating friendship; a negative edge, indicating enmity; or the absence of an edge, indicating that the two endpoints do not know each other

Defining balance for general networks

  • a local definition: we could then say that the graph is balanced if it is possible to fill in all the missing labeled edges in such a way that the resulting signed complete graph is balanced
  • a global definition: alternately, we could take a more global view, viewing structural balance as implying a division of the network into two mutually opposed sets of friends (to the extent that they know each other)

Defining balance for general networks

These two ways of defining balance are equivalent: an arbitrary signed graph is balanced under the first definition if and only if it is balanced under the second definition.

  • if a signed graph is balanced under the first definition, then after filling in all the missing edges appropriately, we have a signed complete graph to which we can apply the Balance Theorem. This gives us a division of the network into two sets X and Y that satisfies the properties of the second definition

  • on the other hand, if a signed graph is balanced under the second definition, then after finding a division of the nodes into sets X and Y, we can fill in positive edges inside X and inside Y, and fill in negative edges between X and Y , and then we can check that all triangles will be balanced

Characterizing the structure of balanced general networks

  • the sign of a path is the product of the signs of its edges
  • a positive path is a path with positive sign (with an even number of negative signs)
  • a negative path is a path with negative sign (with an odd number of negative signs)

A signed graph is balanced if and only if all cycles are positive

Play

Is the following graph balanced? Why?

Solution

It is not balanced since it contains the following negative cycle (with 5 negative edges): 2, 5, 6, 11, 10, 12, 7, 4, 2.

Notice that flipping the sign of edge (2,4) or erasing this edge would make the network balanced.

Approximately balanced general networks

How can we quantify the degree of balanceness of a network?

  • let \(A\) be the adjacency matrix of a signed network, that is a matrix in which positive edges have a weight of +1 and negative edges have a weight of -1
  • let \(|A|\) be the unsigned adjacency matrix corresponding to \(A\), that is a matrix in which all edges (both positive and negative) have a weight of +1
  • there exists no shared notion of approximately balance in the case of general, non-complete networks, but different proposals exist

Triangle index

The triangle index computes the fraction of balanced triangles (cycles of length 3) over all triangles

\[ T = \frac{\mathrm{tr}(A^3) + \mathrm{tr}(|A|^3)}{2 \,\mathrm{tr}(|A|^3)} \] where \(|A|\) is the unsigned matrix of \(A\). Notice that:

  • each positive triangle is counted twice at numerator and each negative triangle is counted 0 at numerator. Hence the numerator is twice the number of positive triangles
  • the denominator is twice the number of unsigned triangles
  • the index in hence between 0 and 1
  • a complete graph is balanced iff the triangle index is 1
  • this is not true for a general graph: if a general graph is balanced (all cycles are positive), then the triangle index is 1
  • however if the triangle index is 1, then the graph can be unbalanced, since there might be negative cycles longer than 3

Walk index

The walk index it is the ratio of signed positive cycles to unsigned cycles of arbitrary length.

It was proposed by Ernesto Estrada in this paper.

Recall that, if \(A\) is a matrix (or a number):

\[e^A = \sum_{s=0}^{\infty} \frac{A^s}{s!}\]

Notice that:

  • \((|A|^s)_{i,i}\) is the number of cycles of length \(s\) containing node \(i\)
  • \((A^s)_{i,i}\) is the net number of cycles of length \(s\) containing node \(i\), that is the number of positive cycles minus the number of negative cycles

Then, if \(\lambda_j\) and \(\mu_j\) are the eigenvalues of \(A\) and \(|A|\), respectively, the walk index is:

\[ K = \frac{\sum_{s=0}^{\infty} \mathrm{tr}(A^s) / s!}{\sum_{s=0}^{\infty} \mathrm{tr}(|A|^s) / s!} = \frac{\mathrm{tr}(e^A)}{\mathrm{tr}(e^{|A|})} = \frac{\sum_{j=1}^{n} e^{\lambda_j}}{\sum_{j=1}^{n} e^{\mu_j}} \] Notice the factor \(s!\): it gives more weight to dyads, then triads, squares and so on. Hence, shorter negative cycles are more important (more stressful for the system) than longer ones in this definition.

  • \(K = 1\) if and only if the graph is balanced
  • a result proved by Acharya in 1980 claims that for any signed graph, the matrices \(A\) and \(|A|\) are isospectral if and only if the signed graph is balanced
  • the sum at numerator is biased towards positive paths, since a multiple of a positive path (taking the path more times) is still positive, but any even multiple of a negative path is positive
  • \(K > 0\) and the lower bound is only asymptotically obtained for highly imbalanced graphs for which the number of negative paths approaches the number of positive ones
# A graph with only negative self-loops
A = diag(-1, 3)
B = abs(A)
# Eigenvalues of a diagonal matrix are on the diagonal of the matrix
lambda = diag(A)
mu = diag(B)
# Walk index (close to 0)
(a = sum(exp(1)^lambda))
## [1] 1.103638
(b = sum(exp(1)^mu))
## [1] 8.154845
(k = a / b)
## [1] 0.1353353

Frustration index

The frustration number is the minimum number of edges whose deletion makes the network balanced

  • the frustration index is the frustration number normalized in [0, 1] such that it is maximal (1) if the network is balanced
  • the frustration index computation is an NP-hard problem
  • it was proposed by Samin Aref & Zachary Neal

Play

Prove that the problem of computing the frustration number is NP-hard.

Hint: show the equivalence of the special case with all edges being negative with the Max-Cut problem, the problem of finding the largest cut in a graph.

Applications of structural balance - international politics

  • international politics represents a setting in which it is natural to assume that a collection of nodes all have opinions (positive or negative) about one another
  • here the nodes are nations, and + and − labels indicate alliances or animosity
  • research in political science has shown that structural balance can sometimes provide an effective explanation for the behavior of nations during various international crises
  • this also reinforces the fact that structural balance is not necessarily a good thing: since its global outcome is often two implacably opposed alliances, the search for balance in a system can sometimes be seen as a slide into a hard-to-resolve opposition between two sides

Applications of structural balance - social networks

  • the terminology of on-line social networks emphasis on the positive connections one forms with friends, fans, followers, and so forth
  • but in most network settings, there are also negative effects at work
  • a growing source for network data with both positive and negative edges comes from user communities on the Web where people can express positive or negative sentiments about each other
  • for instance in the technology news site Slashdot the users can designate each other as a friend or a foe
  • on on-line e-commerce sites such as eBay or vacantial rental marketplaces like airbnb, users can express trust or distrust of other users
  • structural balance theory can be used to predict new links among users (and recommend products to users) not only based of positive relations but also on negative ones (for instance, why not link to the enemy of my enemy?)

Centrality measures on signed networks

  • degree and recursive centrality measures can be extended to networks with signs or negative weights on edges
  • as for degree, we have four versions:
    • number of friends or sum of positive weights
    • number of enemies or sum of negative weights
    • difference between the two above
    • number of friends / (number of friends + number of enemies)
  • as for recursive metrics, we thesis has to be updated as follows:
    • to be friend of a positive node increases my score
    • to be enemy of a positive node decreases my score
    • to be friend of a negative node decreases my score
    • to be enemy of a negative node increases my score
  • notice that on arbitrary matrices including negative weights, the Perron-Frobenious Theorem does not hold anymore
  • therefore the adjacency matrix may not have a unique dominant eigenvalue/eigenvector: in this case the measure does not make sense

Code

The signnet package brings together methods that have been developed to analyse signed networks. This includes:

The package includes two well known datasets:

  • the tribes dataset is a signed social network of tribes of the Gahuku–Gama alliance structure of the Eastern Central Highlands of New Guinea. The network contains sixteen tribes connected by friendship (“rova”) and enmity (“hina”)
  • the cowList dataset contains a list of 52 signed networks of inter-state relations over time (1946-1999). Two countries are connected by a positive tie if they form an alliance or have a peace treaty. A negative tie exists between countries who are at war or in other kinds of conflicts
library(igraph)
library(signnet)
library(ggplot2)

# generate a balanced signed network with two sets of 10 nodes
g = sample_islands_signed(islands.n = 2, islands.size = 10,
                           islands.pin = 0.8, n.inter = 5)

# visualize network
ggsigned(g)

# count signed triangles
count_signed_triangles(g)
## +++ ++- +-- --- 
## 152   0   6   0
# print signed tringles (P is the number of positive signs)
head(signed_triangles(g))
##      V1 V2 V3 P
## [1,]  1  2  8 3
## [2,]  1  2  7 3
## [3,]  1  3 17 1
## [4,]  1  3  2 3
## [5,]  1  3  4 3
## [6,]  1  3  5 3
# recall g is balanced
balance_score(g, method = "triangles")
## [1] 1
balance_score(g, method = "walk")
## [1] 1
# Note that the problem is NP complete and only a lower bound 
# of the index is returned (based on simulated annealing, 
# which is not deterministic)
balance_score(g, method = "frustration")
## [1] 0.8390805
# use tribes dataset
data("tribes")

## visualize graph
ggsigned(tribes)

# adj matrix
as_adj_signed(tribes)[1:5,1:5]
##       Gavev Kotun Ove Alika Nagam
## Gavev     0     1  -1    -1    -1
## Kotun     1     0  -1     0    -1
## Ove      -1    -1   0     1     0
## Alika    -1     0   1     0     0
## Nagam    -1    -1   0     0     0
# find signed triangles
count_signed_triangles(tribes)
## +++ ++- +-- --- 
##  19   2  40   7
# degree of balance
balance_score(tribes, method = "triangles")
## [1] 0.8676471
balance_score(tribes, method = "walk")
## [1] 0.3575761
balance_score(tribes, method = "frustration")
## [1] 0.7586207
# number of friends
degree_signed(tribes, type="pos")
## Gavev Kotun   Ove Alika Nagam Gahuk Masil Ukudz Notoh Kohik Geham Asaro Uheto 
##     3     3     4     2     3     5     7     6     3     2     4     4     4 
## Seuve Nagad  Gama 
##     2     3     3
# number of enemies
degree_signed(tribes, type="neg")
## Gavev Kotun   Ove Alika Nagam Gahuk Masil Ukudz Notoh Kohik Geham Asaro Uheto 
##     5     5     2     1     4     5     0     1     4     3     5     4     4 
## Seuve Nagad  Gama 
##     3     6     6
# number of friends - number of enemies
degree_signed(tribes, type="net")
## Gavev Kotun   Ove Alika Nagam Gahuk Masil Ukudz Notoh Kohik Geham Asaro Uheto 
##    -2    -2     2     1    -1     0     7     5    -1    -1    -1     0     0 
## Seuve Nagad  Gama 
##    -1    -3    -3
# number of friends / (number of friends + number of enemies)
degree_signed(tribes, type="ratio")
##     Gavev     Kotun       Ove     Alika     Nagam     Gahuk     Masil     Ukudz 
## 0.3750000 0.3750000 0.6666667 0.6666667 0.4285714 0.5000000 1.0000000 0.8571429 
##     Notoh     Kohik     Geham     Asaro     Uheto     Seuve     Nagad      Gama 
## 0.4285714 0.4000000 0.4444444 0.5000000 0.5000000 0.4000000 0.3333333 0.3333333
# signed eigenvector centrality
# the function returns an error if the adjacency matrix has not a unique dominant eigenvalue
x = eigen_centrality_signed(tribes)
names(x) = V(tribes)$name
x
##      Gavev      Kotun        Ove      Alika      Nagam      Gahuk      Masil 
## 1.00000000 0.88879594 0.71609511 0.36818769 0.74303157 0.95454966 0.76017717 
##      Ukudz      Notoh      Kohik      Geham      Asaro      Uheto      Seuve 
## 0.67100494 0.21063522 0.23890028 0.69639725 0.91352983 0.23390770 0.05855799 
##      Nagad       Gama 
## 0.91193923 0.98724909
# a graph with two dominant eigenvalues
# all nodes have two friends and two foes
A = matrix(c( 0,  1,  1, -1,  0,  0, -1,  0,  0, 
               1,  0,  1,  0, -1,  0,  0, -1,  0, 
               1,  1,  0,  0,  0, -1,  0,  0, -1, 
              -1,  0,  0,  0,  1,  1, -1,  0,  0, 
               0, -1,  0,  1,  0,  1,  0, -1,  0, 
               0,  0, -1,  1,  1,  0,  0,  0, -1, 
              -1,  0,  0, -1,  0,  0,  0,  1,  1, 
               0, -1,  0,  0, -1,  0,  1,  0,  1, 
               0,  0, -1,  0,  0, -1,  1,  1, 0), 9, 9)

g = graph_from_adjacency_matrix(A, "undirected", weighted = "sign")
plot(g, edge.label = E(g)$sign)

# eigenvalues
round(eigen(A)$values, 6)
## [1]  3  3  0  0  0  0  0 -3 -3
# returns an error
#eigen_centrality_signed(g)

Play

Consider the following unbalanced graph. Compute on it the triangle, walk and frustation indices.

Solution

library(signnet)

adj = matrix(c(  
#1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
 0,  1,  1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  #  1
 1,  0,  1, -1,  1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  #  2
 1,  1,  0,  0,  0, -1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  #  3
 0, -1,  0,  0,  0, -1, -1,  0, -1,  0,  0,  0,  0,  0,  0,  #  4
 0,  1,  0,  0,  0, -1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  #  5
 0,  0, -1,  0, -1,  0,  0,  1,  0,  0, -1,  0,  0,  0,  0,  #  6
 0,  0,  0, -1,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0,  #  7
 0,  0,  0,  0,  0,  1,  0,  0,  0,  0, -1,  0,  0,  0,  0,  #  8
 0,  0,  0, -1,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0,  #  9
 0,  0,  0,  0,  0,  0,  0,  0,  0,  0, -1,  1,  0,  0,  0,  # 10
 0,  0,  0,  0,  0, -1,  0, -1,  0, -1,  0,  0, -1, -1,  0,  # 11
 0,  0,  0,  0,  0,  0,  1,  0,  1,  1,  0,  0,  1,  0,  0,  # 12
 0,  0,  0,  0,  0,  0,  0,  0,  0,  0, -1,  1,  0,  0, -1,  # 13
 0,  0,  0,  0,  0,  0,  0,  0,  0,  0, -1,  0,  0,  0, -1,  # 14
 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, -1, -1,  0), # 15
nrow = 15, ncol = 15)

g = graph_from_adjacency_matrix(adj, "undirected", weighted = "sign")

balance_score(g, method = "triangles")
## [1] 1
balance_score(g, method = "walk")
## [1] 0.9998143
# lower bound. Exact value is 1 - 1/21 = 0.952381
balance_score(g, method = "frustration")
## [1] 0.9047619