Social Networks

Social networks are networks in which the nodes are people, or sometimes groups of people, and the edges represent some form of social interaction between them, such as friendship. Sociologists have developed their own language for discussing networks: they refer to the vertices, the people, as actors and the edges as ties.

To most people the words social networks mean social networking services such as Facebook and Twitter. In fact, the study of social networks goes back much farther. Among researchers who study networks, sociologists have the longest and best established tradition of quantitative, empirical study.

The true foundation of the field is attributed to psychiatrist Jacob Moreno, a Romanian immigrant to America who in the 1930s became interested in the dynamics of social interactions within groups of people. In 1934 Moreno published a book entitled Who Shall Survive?, which contained the seeds of the field of sociometry, which later became social network analysis. Moreno called his diagrams sociograms, rather than social networks. The first example of a social network was a hand-drawn image depicting friendship patterns between the boys and the girls in a class of schoolchildren. The figure reveals that there are many friendships between two boys or between two girls, but few between a boy and a girl.

Another early study of social networks is the affiliation network of the so-called Southern Women Study, published in 1941 in a book entitled Deep South. Davis, Gardner and Gardner made use of newspaper reports of public appearances of society women to study a social network of 18 women in a city in the American South. They took a sample of 14 social events attended by the women in question and recorded which women attended which events. Women in this network may be considered linked if they attended the same event. An alternative and more complete representation of the data is an affiliation network or bipartite graph, a network with two types of vertex, representing the women and the events, with edges connecting each woman to the events she attended. The researchers found that the women split into two subgroups, tightly knit clusters of acquaintances with only rather loose between-cluster interaction.

One thing to appreciate about social networks is that there are many different possible definitions of an edge in such a network. Edges might represent friendship, professional relationships, exchange of goods and money, communication patterns, romantic or sexual relationships, or many other types of connection. Since Moreno and Davis et al., social network analysis has been applied to a variety of different communities, including friendship and acquaintance patterns in the general population, among students, or among schoolchildren; contacts between business people and other professionals; boards of directors of companies; collaborations of scientists, movie actors, and musicians; sexual contact networks and dating patterns; criminal networks such as networks of drug users and terrorists; historical networks; online communities such as Usenet, Facebook and Twitter; and social networks of animals.

Measuring a social network

A crucial issue in the study of social networks is the empirical method for accumulating data on the network. Two techniques are most widely used: direct questioning of subjects and the use of archival records.

The most common method is simply to ask people questions. If you are interested in friendship networks, for instance, then you ask people who their friends are. The asking may take the form of direct interviews with participants or the completion by participants of questionnaires. The main disadvantages of network studies based on direct questioning of participants are that they are, first, laborious and, second, inaccurate. The administering of interviews or questionnaires and the collation of responses is a demanding job. For this reason, most studies have been limited to a few tens or at most hundreds of actors. Moreover, answers given by respondents are always, to some extent, subjective. If you ask people who their friends are, different people will interpret friendship in different ways and thus give different kinds of answers.

An increasingly important, voluminous, and often highly reliable source of social network data is archival records. Such records are often impressive in their scale, allowing us to construct networks of large size. A number of authors have looked at email networks. Drawing on email logs, the automatic records kept by email servers of messages sent, or on email address books, it is possible to construct networks in which the vertices are people (or more correctly email addresses) and the directed edges are email messages sent between them. Such networks can be taken as a proxy for acquaintance networks. Moreover, knowledge of the structure of email networks may help us to predict and control the spread of computer viruses carried by email messages.
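
The construction of an email network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log format (one sender and recipient pair per message), the addresses, and the function names build_email_network and reciprocated_pairs are all hypothetical, and real server logs will look different.

```python
from collections import defaultdict

# Hypothetical email log: one (sender, recipient) pair per message sent.
# The addresses are invented for illustration.
email_log = [
    ("ann@example.org", "bob@example.org"),
    ("ann@example.org", "bob@example.org"),
    ("bob@example.org", "ann@example.org"),
    ("ann@example.org", "cat@example.org"),
    ("dan@example.org", "ann@example.org"),
]

def build_email_network(log):
    """Return {sender: {recipient: message_count}}, a directed weighted graph."""
    graph = defaultdict(lambda: defaultdict(int))
    for sender, recipient in log:
        graph[sender][recipient] += 1
    return {sender: dict(recipients) for sender, recipients in graph.items()}

def reciprocated_pairs(graph):
    """Pairs of addresses that have written to each other in both directions."""
    pairs = set()
    for sender, recipients in graph.items():
        for recipient in recipients:
            if sender in graph.get(recipient, {}):
                pairs.add(frozenset((sender, recipient)))
    return pairs

email_network = build_email_network(email_log)
mutual = reciprocated_pairs(email_network)
```

Keeping only reciprocated pairs is one possible way, assumed here, of turning the directed message network into a rough proxy for acquaintance; other thresholds are equally defensible.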

Recent years have seen the rapid emergence of online social networking services like Facebook, Twitter, and LinkedIn. As a natural part of their operation, these services build records of connections between their participants and hence provide, at least in principle, a rich source of archival network data. These data, however, are largely proprietary to the companies operating the services and hence quite difficult to get hold of.

Other examples of online communities, not explicitly oriented towards networks, from which network data can be extracted, are dating websites, newsgroups, and weblogs. For instance, one can reconstruct the threads of conversations taking place in a newsgroup, and assemble a network in which the vertices are posters and the edges represent a response by one poster to a posting by another. On weblogs, the proprietors post whatever they want to make public, along with links to sites maintained by others, often friends and acquaintances. These links form a directed network that lies between a social network and the World Wide Web.

An important special case of the reconstruction of networks from archival records is the affiliation network. An affiliation network is a network in which actors are connected via co-membership of groups of some kind. The most complete representation of an affiliation network is as a bipartite graph, where the network has two types of vertex representing the actors and the groups, with edges connecting the actors to the groups to which they belong. A historical example is the aforementioned Southern Women Study of Davis et al. Perhaps the best known example is the collaboration network of film actors, in which the actors in the network sense are the actors in the dramatic sense, and the groups to which they belong are the film casts. The network is the basis of a popular parlor game, sometimes called the Six Degrees of Kevin Bacon, in which one tries to connect an actor to Kevin Bacon via chains of intermediate costars. Another example of a large affiliation network is the collaboration network of academics. In this network an actor is an academic author and a group is the set of authors of a learned paper. Excellent and very comprehensive archives exist in many academic fields, from which large collaboration networks can be assembled and studied.
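
The relation between the bipartite representation and the simpler network in which actors are linked when they share a group can be sketched as follows. The membership data, the names, and the function project_onto_actors are hypothetical and serve only as an illustration.

```python
from itertools import combinations

# Hypothetical affiliation data: each actor mapped to the set of groups
# (events, film casts, papers) it belongs to. This is the bipartite graph.
membership = {
    "Alice": {"e1", "e2"},
    "Beth": {"e1", "e2", "e3"},
    "Carol": {"e3"},
    "Dana": {"e4"},
}

def project_onto_actors(membership):
    """One-mode projection: link two actors if they share at least one group.

    The weight of an edge is the number of groups the two actors share.
    """
    edges = {}
    for a, b in combinations(sorted(membership), 2):
        shared = len(membership[a] & membership[b])
        if shared:
            edges[(a, b)] = shared
    return edges

projection = project_onto_actors(membership)
```

The projection records that Alice and Beth share two groups but no longer says which ones, which is why the bipartite form is the more complete representation.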

An interesting network that has features of both a social and a technological network is the network of trust formed by a set of cryptographic keys. The act of digitally signing someone else's public key in the context of public-key cryptography is equivalent to saying that you know, or at least believe, the public key to be genuine, belonging to the person it claims to belong to. The act can be represented as a directed edge in a network. The vertices in the network represent the parties involved and a directed link from party A to party B indicates that A has signed B's public key, that is, A trusts B. The resulting network certainly has technological aspects but also social implications, since people tend to vouch for the keys of other people they know and trust.

Another way to measure a network is direct observation: simply by watching interactions between actors one can, over a period of time, form a picture of the network of unseen ties that exists between them. One arena in which direct observation is essentially the only viable experimental technique for assembling a network is the study of the social networks of animals, since clearly animals cannot be surveyed using interviews or questionnaires. Informative studies have been performed for monkeys, kangaroos, and dolphins.

A network may change over time, and sometimes network data are time-resolved: the date of each interaction between a pair of vertices, which forms an edge of the network, is recorded. For instance, collaboration network data are often time-resolved since bibliographies contain at least the year of each recorded academic publication. Hence, collaboration links between authors can be stamped with the year in which the collaboration took place. Time-resolved network studies, or longitudinal studies as they are called in sociology, allow for a temporal analysis of the network, that is, an analysis of how network properties change over time.
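
A minimal sketch of what time-resolved data make possible, assuming hypothetical collaboration records of the form (author, author, year); the names and the helper functions snapshot and edge_counts_by_year are invented for illustration.

```python
# Hypothetical time-stamped collaboration records: (author, author, year).
collaborations = [
    ("Ito", "Khan", 2001),
    ("Ito", "Lopez", 2003),
    ("Khan", "Lopez", 2003),
    ("Lopez", "Meyer", 2006),
]

def snapshot(records, up_to_year):
    """Undirected edges formed in or before `up_to_year`."""
    return {frozenset((a, b)) for a, b, year in records if year <= up_to_year}

def edge_counts_by_year(records):
    """Cumulative number of distinct edges at each year present in the data."""
    years = sorted({year for _, _, year in records})
    return {year: len(snapshot(records, year)) for year in years}

growth = edge_counts_by_year(collaborations)
```

Any network property, not only the number of edges, could be computed on each snapshot to follow its evolution over time.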

The small-world experiment

An important contribution to the social networks literature was made by the experimental psychologist Stanley Milgram in the 1960s with his now-famous small-world experiment. Milgram was interested in quantifying the typical distance between actors in a social network. The geodesic distance between two vertices in a network is the length, in terms of number of edges, of a shortest path connecting them. Milgram wanted to measure the typical geodesic distance in real networks, and to do this he concocted the following experiment. Milgram sent a set of packages, 96 in total, to recipients randomly chosen from the telephone directory in the US town of Omaha, Nebraska. The packages contained written instructions asking the recipients to attempt to get an included passport to a specified target individual, a friend of Milgram's who lived in Boston, Massachusetts, over a thousand miles away. The only information supplied about the target was his name, address, and occupation. But the passport holders were not allowed simply to send the passport to the given address. Instead, they were asked to pass it to someone they knew on a first-name basis, and more specifically to the person in this category who they felt would stand the best chance of getting the passport to the intended target. Each receiver was asked to repeat the process so that, after a succession of such steps tracing a path on the social network of acquaintances, the passport would find its way to the target person. The length of such a path provided an upper bound on the geodesic distance in the social network between the starting and ending individuals.
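
When the whole network is known, which was not the case for Milgram, the geodesic distance can be computed directly with a breadth-first search. The following is a minimal sketch; the small acquaintance network and the function name geodesic_distance are hypothetical.

```python
from collections import deque

def geodesic_distance(graph, source, target):
    """Number of edges on a shortest path from source to target.

    `graph` maps each vertex to an iterable of its neighbours.
    Returns None if the target cannot be reached from the source.
    """
    if source == target:
        return 0
    distance = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph.get(vertex, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[vertex] + 1
                if neighbour == target:
                    return distance[neighbour]
                queue.append(neighbour)
    return None

# Hypothetical undirected acquaintance network: each tie is listed both ways.
acquaintances = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
    "f": [],
}
```

Any chain of acquaintances actually followed from "a" to "e" has at least geodesic_distance(acquaintances, "a", "e") steps, which is the sense in which an observed chain gives an upper bound on the geodesic distance.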

Of the 96 passports sent out, 18 found their way to the target in Boston. Milgram asked participants to record in the passport each step of the path taken. He found that the mean length for completed paths was 5.9 steps. This result is the origin of the idea of the six degrees of separation, the popular belief that there are only about six steps between any two people in the world. The phrase six degrees of separation is in fact more recent and comes from a popular Broadway play by John Guare, later made into a film, in which the lead character discusses Milgram's work.

For many reasons Milgram's experiment should be taken with a large pinch of salt. Milgram used only a single target; all the initial recipients were in a single town; there is no guarantee that the chains took the shortest possible route to the target; finally, most of the chains were not completed. Even so, the fundamental result that vertex pairs in social networks tend on average to be connected by paths that are short relative to the number of nodes in the network is now widely accepted, and has moreover been shown to extend to many other kinds of networks.

Milgram found that most of the passports that did find their way to the target did so via just three of the target's friends, whom Milgram called sociometric superstars. This phenomenon, namely that a large number of the paths from any vertex to the rest of the network go through a limited number of neighbours of the vertex, is sometimes referred to as the funneling effect.

With the complete network at hand, the problem of finding the shortest path between two nodes is easy to solve using a breadth-first search of the graph. Kleinberg noticed that, although participants in Milgram's experiment had only local information about their own neighbourhood of acquaintances, they somehow managed to find remarkably short paths to the target. This happened because participants had a good idea of which of their friends was closest to the target in terms of geodesic distance, and they chose to send the passport to this closest friend. Related to this, Killworth and Bernard conducted a reverse small-world experiment in which they asked participants to imagine that they were taking part in a small-world experiment. Participants were asked what they wanted to know about the target person in order to decide to whom to forward the message. Interestingly, the three characteristics most often indicated were the same three pieces of information that Milgram provided in his original experiment, namely name, geographic location, and occupation.
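
The idea of forwarding with only local information can be illustrated with a greedy rule, in the spirit of the decentralised search studied by Kleinberg. This is a sketch under strong assumptions: each person is given a hypothetical position on a line, standing in for cues such as geographic location, and each holder passes the message to whichever of its own acquaintances lies closest to the target. The network, the positions, and the function name greedy_route are invented for illustration.

```python
def greedy_route(neighbours, position, source, target):
    """Forward a message using only local information.

    At each step the current holder looks only at its own neighbours and
    passes the message to the one closest to the target according to
    `position`. Returns the chain of holders, or None if no neighbour is
    closer to the target than the current holder (the chain stalls).
    """
    def gap(vertex):
        return abs(position[vertex] - position[target])

    chain = [source]
    current = source
    while current != target:
        best = min(neighbours[current], key=gap, default=None)
        if best is None or gap(best) >= gap(current):
            return None  # stalled: nobody nearby is closer to the target
        chain.append(best)
        current = best  # the gap strictly shrinks, so the loop terminates
    return chain

# Hypothetical network: eight people on a line, each tied to the next,
# plus two long-range ties (0-5 and 2-7).
position = {v: v for v in range(8)}
neighbours = {
    0: [1, 5], 1: [0, 2], 2: [1, 3, 7], 3: [2, 4],
    4: [3, 5], 5: [4, 6, 0], 6: [5, 7], 7: [6, 2],
}
```

A chain found this way is not guaranteed to be a shortest path, and it can fail altogether, but no holder ever needs to know more than its own acquaintances and something about the target.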