Tag Archives: Social Network Analysis

AAA Recap, 2013

I guess it's that time of the year. You know, when I recap, in my bittersweet way, the annual meeting of the American Anthropological Association? I am an anthropologist, yes, but I am deeply torn in my feelings for my discipline, my department, and my flagship (?) professional organization. The question mark arises because I am also a physical anthropologist and a demographer, so an argument can be made that my flagship professional organization is actually AAPA or PAA, but there is something about the unmarked category that is AAA. It's supposed to represent anthropologists, broadly construed. I honestly don't think that it does a very good job at this, but the reasons behind that are complex and I've only allocated myself a bit of time to blog since I'm desperately trying to catch up from all the travel I've done recently.

The meeting this year was in Chicago, which is a pretty amazing town. I stayed in the Blackstone Renaissance Hotel, which was recently renovated in a lovely Art Deco theme. We did Chicago stuff. Tube steaks were eaten, the quantity of cheese that can be crammed into a deep-dish pizza was marveled at, beer was drunk.

AAA is a pretty bizarre scene. For starters, it's at the weirdest time. It seems like the peculiar timing of AAA during November must be disruptive for just about every academic anthropology department, particularly because it is nearly a week-long endeavor. It seems that life in an American university carries on just fine without the anthropologists around for a week in the middle of the Fall term, thank you very much. A couple of innovations this year struck me as particularly incongruous, given the content of much current scholarship in anthropology. First, anyone who registered for the meeting as a non-member was given a yellow badge holder to mark them as outsiders. This seemed a bit gratuitous. I'm not sure what's gained from such marking -- they already pay a substantially higher rate for the privilege of attending, do they also need to be shamed for their lack of faith? Second, in the hall outside the main bunch of conference rooms, there was a television that played a loop of anthropologists talking about how important anthropology is. This struck me as unnecessarily propagandistic and it's not at all clear to me who the target audience for this performance was. Presumably, those of us who were there already think that anthropology is a worthwhile endeavor. Seems to me that it's the rest of the world we need to convince. Once again, there appears to be almost nothing considered newsworthy to emerge from this meeting of 6,000+ scholars with the exception of a paper on the similarities in street-scanning behaviors by police and fashion scouts.

Another strange feature of AAAs is that computers, cables, remotes, laser-pointers, etc. were not provided in the conference rooms but needed to be provided by the session chairs. This is the first time I've experienced this in years at a major conference and it definitely slowed us down quite a bit at the start of our session. I'm not sure what was going on with that. Maybe the budget to pay for AV services was already spent on the fancy video production that reminded us how important we all are?

This year, I organized and chaired a session, which was sponsored by EAS, on social network analysis in evolutionary anthropology. Unfortunately for the EAS party-goers from the previous night, the session ran at 08:00 on Saturday morning. Despite this challenge, the room was packed and the audience generally seemed into it. We had great talks by Stanford's own Elly Power and Ashley Hazel. Elly talked about her amazing dissertation research on using social capital to understand costly displays of religious devotion in southern India. Ashley talked about her dissertation work in the School of Natural Resources and the Environment on mobility and the changing landscape of STI risk in Kaokoland, northern Namibia. David Nolin, one of our discipline's most talented young methodologists, presented a very clever test of generalized reciprocity using dichotomous exchange data from his work in Lamalera in Indonesia. Ben Hannowell, yet another talented methodologist to come out of the WSU/UW IGERT program, discussed his collaborative work with Zack Almquist on inferring dominance structure from tournament graphs. The always marvelous Rebecca Sear talked about her recent synthetic work on the effects of kin on fertility (kinship, of course, is the classic application of networks in anthropology since genealogies are just special cases of graphs). John Ziker presented a network-based approach to understanding food sharing and reciprocity from his terrific ethnographic work in Siberia. I closed out the talks with my own combination history of anthropological (and ethological) contributions to social network analysis and pep talk to encourage anthropologists to be confident about their methods and have the courage to innovate new ones the way people like John Barnes or Clyde Mitchell or Elizabeth Bott or Kim Romney or Russ Bernard did!

After schmoozing for a bit post-session, I headed over to the Saturday EAS session on methodological advances in experimental games. While I didn't see all the talks, the ones I saw were pretty cool. In general, I have mixed feelings about experimental economic games. There are lots of results and some fairly convincing stories to go along with some of the results. However, absent context, I really wonder what they are measuring and, if they are indeed measuring something, whether it is actually interesting. This session made some real progress in dealing with this question and I think it really highlighted the comparative advantage of anthropologists in the multi-disciplinary landscape of twenty-first century behavioral science. While economists such as Loewenstein (1999) might lament the fact that there is no way to play context-less games and that this jeopardizes the validity and generality of such experimental games, anthropologists are experts in thinking specifically about context and its effect on behavior. Furthermore, anthropologists are still the go-to researchers for providing contextual diversity. In this session, we heard about experimental games played in Bolivia, Siberia, Fiji, and on the streets of Las Vegas. One talk in this session that particularly impressed me was given by Drew Gerkey, who is currently a post-doc at SESYNC in Annapolis, Maryland (and soon to be an assistant professor at Oregon State University -- Go Beavs!). I was at SESYNC earlier in the week and got a chance to talk pretty extensively with him about this work. Drew makes the point that seems obvious now that I've heard it (a sign of an important idea) that, in the evolution of cooperation literature, the counterfactual scenario to cooperation is frequently untenable. One does not simply go it alone when one is a hunter/fisher in Siberia. Drew also designed a number of very clever experimental games that fit the types of social dilemmas faced by his Siberian interlocutors. Very nice work indeed.

In addition to the sessions I attended, it was nice to see and chat with various smart, fun people I know who sometimes find their way to AAAs. I missed my partner in crime from last year's AAA, Charles Roseman, who left the day I arrived, probably too bloated from the binge on Chicago's amazing food he no doubt shared with Fernando Armstrong-Fumero to be of much use to anyone. However, I got to see Siobhan Mattison, Brooke Scelza, Brian Wood, Rick Bribiescas, Mary Shenk, Aaron Blackwell, Pete Kirby and, briefly, Shauna Burnsilver and Dan Hruschka. Despite my general misgivings about the conference, it is nice to have an excuse to see so many cool people in one place at one time.

Why the Prediction Market Failed to Predict the Supreme Court

There is a very interesting piece in the New York Times today by David Leonhardt on the apparent backlash against prediction markets such as Intrade and Betfair. In principle, these markets make predictions by aggregating the disparate information of many independent bettors who offer prices for a particular outcome. Prediction markets have enjoyed a fair amount of success in recent elections. The University of Iowa has even set up an influenza prediction market. But prediction markets are hardly perfect and have had some pretty big recent failures. It turns out that Intrade failed in a pretty spectacular manner to predict the outcome of the recent Supreme Court ruling about the constitutionality of the Affordable Care Act. Leonhardt suggests that some of the failures of online prediction markets are attributable to the relatively small number of people who actually trade on the market:

But the crowd was not everywhere wise. For one thing, many of the betting pools on Intrade and Betfair attract relatively few traders, in part because using them legally is cumbersome. (No, I do not know from experience.) The thinness of these markets can cause them to adjust too slowly to new information.

This may have been an issue with the ACA decision but the primary problem with the incorrect prediction is that the crowd doesn't actually know much about the workings of the very closed social network that is the United States Supreme Court. Writes Leonhardt:

And there is this: If the circle of people who possess information is small enough -- as with the selection of a vice president or pope or, arguably, a decision by the Supreme Court -- the crowds may not have much wisdom to impart. 'There is a class of markets that I think are basically pointless,' says Justin Wolfers, an economist whose research on prediction markets, much of it with Eric Zitzewitz of Dartmouth, has made him mostly a fan of them. 'There is no widely available public information.'

This point gets at a larger critique of market-based solutions to problems suggested by my Stanford colleague Mark Granovetter over 25 years ago (Granovetter 1985). This is the problem of embeddedness. The idea of embeddedness was anticipated by the work of substantivist economist Karl Polanyi, but Granovetter really laid out the details. Granovetter writes (1985: 487): "A fruitful analysis of human action requires us to avoid the atomization implicit in the theoretical extremes of under- and oversocialized conceptions [of human action]. Actors do not behave or decide as atoms outside a social context, nor do they adhere slavishly to a script written for them by the particular intersection of social categories that they happen to occupy. Their attempts at purposive action are instead embedded in concrete, ongoing systems of social relations." Atomization is independent bettors making decisions about the price they are willing to pay for a certain outcome.

The argument for embeddedness emerges in Granovetter's paper from the problem of trust in markets. Where does trust come from in competitive markets? The fundamental problem here regards the micro-foundations of markets where "the alleged discipline of competitive markets cannot be called on to mitigate deceit, so the classical problem of how it can be that daily economic life is not riddled with mistrust and malfeasance has resurfaced." (p. 488). The obvious solution to this is that actors choose to deal with alters whom they trust and that the most effective way to develop trust is to have prior dealings with an alter.

Granovetter's embeddedness theory is a modest one. He notes that, unlike the alternative models, his "makes no sweeping (and thus unlikely) predictions of universal order or disorder but rather assumes that the details of social structure will determine which is found." (p. 493)

These ideas about the careful analysis of social structure and networks of interlocking relationships are fundamental for understanding when the crowd will be wise and when it will not. They are also essential for developing effective development interventions and, for that matter, making markets work for the public good in general. The theory of embeddedness allows for the possibility that markets can work but if we are to understand when they work and when they don't, we need to think about social structure as more than just a bit of friction in an ideal market and take its measurement more seriously. People are not ideal gases. (Dirty little secret: most gases are not ideal gases). This gets at some problems that I have been thinking about a lot recently relating to additive, observational noise vs. process noise and their implications for the prediction of multi-species epidemics, but that must wait for another post...

 

 

My Erdős Number

Paul Erdős was the great peripatetic, and highly prolific, mathematician of the 20th century. A terrific web page run by Jerry Grossman at Oakland University provides details of the Erdős Project. Erdős was a pioneer in graph theory, which provides the formal tools for the analysis of social networks.  A collaboration graph is a special graph in which the nodes are authors and an edge connects authors if they co-author a publication. Erdős was such a prolific collaborator that he forms a major hub in the mathematics collaboration graph, linking many disparate authors in the different realms of pure and applied mathematics.

For whatever reason, today I used Grossman's directions for finding one's number. <drum roll> My Erdős number is 4.  The path that leads me to Erdős is pretty sweet, I have to say.  This past year, I published a paper in PNAS with Marc Feldman.  Marc wrote a number of papers (here's one) with Sam Karlin (who, I'm proud to say, came and slept through at least one talk I gave at the Morrison Institute). Karlin wrote a paper with Gábor Szegő, who wrote a paper with Erdős.  Lots of Stanford greatness there that I feel privileged to be a part of. It turns out that I have independent (though longer) paths through my co-authors Marcel Salathé and Mark Handcock as well.
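For the curious, an Erdős number is just a shortest-path length in the collaboration graph. Here is a minimal sketch in base R, assuming a toy graph that contains only the co-authorship chain described above. The node labels and the `collab_distance` helper are made up for illustration; this is not the procedure on Grossman's page, which relies on a bibliographic database.

```r
# Toy collaboration graph: only the co-authorship chain named in the post.
# "Author" stands for the author of this blog (a label chosen for illustration).
edges <- rbind(
  c("Author",  "Feldman"),
  c("Feldman", "Karlin"),
  c("Karlin",  "Szego"),
  c("Szego",   "Erdos")
)

# Breadth-first search: number of edges on a shortest path from `from` to `to`,
# or NA if the two authors are not connected in the graph.
collab_distance <- function(edges, from, to) {
  dist <- c(0)
  names(dist) <- from
  queue <- from
  while (length(queue) > 0) {
    v <- queue[1]
    queue <- queue[-1]
    if (v == to) return(unname(dist[v]))
    # co-authors of v can appear on either side of an edge
    nb <- unique(c(edges[edges[, 1] == v, 2], edges[edges[, 2] == v, 1]))
    for (w in setdiff(nb, names(dist))) {
      dist[w] <- dist[v] + 1
      queue <- c(queue, w)
    }
  }
  NA
}

collab_distance(edges, "Author", "Erdos")  # 4 on this toy chain
```

On a real collaboration graph the same search would also turn up the longer independent paths, since breadth-first search always reports the shortest one first.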

New Paper: Dynamics and Control of Diseases in Networks with Community Structure

Marcel Salathé and I have a brand new paper out in today's issue of the Public Library of Science, Computational Biology. There is also a news piece by Adam Gorlick in the Stanford Report this morning. This is an idea I've been bouncing around for a few years now and I was very fortunate to have Marcel – and his programming wizardry – show up with an interest in the very same topic just at the right time. It's not every day that one of the most talented young theoretical biologists in the world shows up at your office wanting to collaborate. If it ever happens to you, I suggest you act!

The fundamental question is: Does social structure affect the course of epidemics? The answer seems obvious, particularly for infectious diseases that are transmitted by direct person-to-person contact. However, specific work demonstrating the effects of social structure on epidemics can be hard to find. Part of the problem, of course, is that you can hardly do experiments in which you change social structure and then subject populations to an infectious disease. To overcome this ethical and practical barrier to research, epidemiologists, biologists, and social scientists interested in disease and human behavior use mathematical and computational models to study how changes in host behavior affect the outcome of simulated epidemics.

Two specific topics that clearly have some bearing on social structure have been investigated extensively: individual heterogeneity in contact number and individual assortativeness. Epidemic behavior in all but the simplest models has been seen as being driven by heterogeneity. When there is a lot of variance in the number of potentially infectious contacts that individuals in a population have, epidemics are more likely, they infect large segments of the population more quickly, and ultimately infect a larger fraction of the total population. Consider the extreme case where all members of a population have one contact except for one person, who has a contact with everyone else. If we were to draw a picture of such a contact network, it would resemble a star or a wheel with a central hub and spokes:

[Figure: star network, with a central hub and spokes]

Infect any random individual on this star and everyone else is at risk for infection. At the opposite extreme, if everyone has exactly one contact, then a randomly infected person can infect, at most, one other individual.

[Figure: network of isolated couples]
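The contrast between the two extremes can be made concrete with a few lines of base R. This is only a sketch: the population size of 10 and the `reachable` helper are assumptions made for illustration, and "at risk" is taken to mean everyone in the same connected component as the first infected person.

```r
n <- 10  # hypothetical population size (even, so everyone can be paired)

# Star: person 1 has a contact with everyone else.
star <- matrix(0, n, n)
star[1, -1] <- 1
star[-1, 1] <- 1

# Couples: persons (1,2), (3,4), ... each have exactly one contact.
couples <- matrix(0, n, n)
for (i in seq(1, n, by = 2)) {
  couples[i, i + 1] <- 1
  couples[i + 1, i] <- 1
}

deg_star <- rowSums(star)        # one hub with 9 contacts, nine people with 1
deg_couples <- rowSums(couples)  # everyone has exactly 1 contact

# Number of people in the connected component containing `start`,
# i.e. everyone who could eventually be reached along chains of contacts.
reachable <- function(adj, start) {
  seen <- start
  frontier <- start
  while (length(frontier) > 0) {
    nb <- which(colSums(adj[frontier, , drop = FALSE]) > 0)
    frontier <- setdiff(nb, seen)
    seen <- c(seen, frontier)
  }
  length(seen)
}

var(deg_star)          # large variance in contact number
var(deg_couples)       # zero variance
reachable(star, 5)     # the whole population is at risk
reachable(couples, 5)  # only the infected person and their partner
```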

Assortativeness, the tendency for individuals to associate with others like themselves, can either aid or hinder the spread of infections. People in contemporary nation states like the United States show an incredible capacity to form associations with like individuals. We form social relationships, particularly intimate relationships, with people who are similar to us in age, socioeconomic status, sexual orientation, ethnicity, education, religion, forms of deviant behavior such as drug use or criminal activity, etc. Frequently, this assortativeness has the effect of localizing and concentrating epidemiologically important contacts. When this happens, individuals who act as bridges between different communities take on central epidemiological importance. For example, married men who visit commercial sex workers can serve as a critical bridge connecting high-risk populations of sex workers and injection drug users with the general population. Similarly, health care workers can bridge hospital populations with the general population, a phenomenon important for the emergence of SARS in 2002. (Note that for epidemiological applications, we call such individuals “bridges” but in other applications we might call them “brokers” or “entrepreneurs,” highlighting the general importance of such ideas for understanding society.) The existence of such social bridges highlights the fact that people can also assort on characteristics that are not visible attributes and this type of assortative behavior can increase connectivity. In particular, if people with few contacts tend to be connected to people with many contacts (as in the case of the star), then such disassortativeness can increase the epidemic potential in a population.

The aggregate effects of individual behavioral decisions can have a profound effect on the shape and composition of human populations, but there is more to human populations than simply individual behavior. For one thing, human populations are characterized by a hierarchical structure: individuals typically belong to households and households are aggregated into communities, which are, in turn, aggregated in towns, states, nations, etc. Naturally, there are cross-cutting ties in such hierarchical organization (much like bridges in individual contact networks). Freudian fantasies of primitive hordes aside, even the largely egalitarian societies of hunter-gatherers are characterized by a hierarchical structuring of families, bands, and tribes. Hierarchical structuring is clearly important for understanding social process in human societies.

So what effect does such community structure have on epidemics? To address this question, Marcel and I combined the formalisms of social network analysis and computational models of epidemics. We already know that heterogeneity in contact number can have profound effects on the outcomes of epidemics and that such heterogeneity can change aggregate social structure in complex ways. To avoid such complications, we generated networks where every individual had the exact same number of contacts. The only thing that varied in these toy networks was the likelihood that any randomly chosen connection between two individuals would be either within or between more or less cohesive subgroups (a.k.a., “communities”). Using metrics derived from Graph Theory, the branch of discrete mathematics that provides the basic tools for Social Network Analysis, we were able to characterize the degree of community structure and relate this to the outcome of epidemics simulated on the resulting networks.
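To give a feel for what "characterizing the degree of community structure" can look like, here is a small base R sketch. The text above does not name the metric, so the choice of modularity Q (a widely used graph-theoretic measure) is an assumption of this example, as are the 8-person ring network and the `modularity_q` helper. The ring is convenient because, as in the toy networks described above, everyone has exactly the same number of contacts.

```r
# Modularity Q of a partition: for each community, the share of edges that fall
# inside it minus the share expected if edges were placed at random while
# keeping everyone's number of contacts fixed.
modularity_q <- function(adj, membership) {
  m <- sum(adj) / 2                  # number of undirected edges
  deg <- rowSums(adj)
  q <- 0
  for (g in unique(membership)) {
    idx <- membership == g
    e_in <- sum(adj[idx, idx]) / 2   # edges inside community g
    d_g <- sum(deg[idx])             # total number of contacts in community g
    q <- q + e_in / m - (d_g / (2 * m))^2
  }
  q
}

# Hypothetical toy network: a ring of 8 people, each with exactly 2 contacts.
ring <- matrix(0, 8, 8)
for (i in 1:8) {
  j <- i %% 8 + 1
  ring[i, j] <- 1
  ring[j, i] <- 1
}

blocks <- rep(1:2, each = 4)        # communities {1,2,3,4} and {5,6,7,8}
alternating <- rep(1:2, times = 4)  # odd vs. even: every contact crosses groups

modularity_q(ring, blocks)       # 0.25: most contacts stay within a community
modularity_q(ring, alternating)  # -0.5: no contacts stay within a community
```

The number of contacts per person is identical under both groupings; only the split between within-community and between-community ties differs, which is exactly the quantity that was varied in the toy networks.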

It turns out that community structure has an enormous effect on epidemic outcome. In particular, we found that there is a remarkably abrupt transition from small outbreaks to very large outbreaks as we moved from the most structured populations to more moderately structured ones. Populations characterized by extreme community structure have smaller outbreaks because the infection has a hard time getting out of a community before dying out. As more connections to other communities are made – i.e., the community structure is lessened – there are more opportunities for the infection to escape and affect a larger fraction of the total population. While the result sounds intuitively satisfying after the fact, there was little precedent for expecting such an outcome in the mathematical theory of epidemics. This is because none of the standard metrics of an infectious disease – the basic reproduction ratio, in particular – changed as the populations’ community structure changed.

When we investigated the further structural network correlates of epidemic size, we found that one measure in particular predicted epidemic behavior quite well. This measure, known as “betweenness centrality,” harkens back to previous epidemiological interest in bridging individuals. A person with high betweenness lies on many of the shortest paths that connect all individuals in a network. When a person bridges two distinct subpopulations, he or she typically has high betweenness because all paths from individuals in one cluster have to pass through this person to get to the other cluster, and vice-versa. As a population moves from a condition of very high community structure to a more moderate level, the number of people with high betweenness increases. This highlights a particularly interesting contrast with previous models: epidemics are more likely and larger in populations with highly unequal distributions of contacts on the one hand, but also in populations with more equal betweenness.
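Betweenness is easy to compute on a small example. The sketch below uses base R and the standard shortest-path counting approach (Brandes' algorithm) for an unweighted, undirected network. The toy network of two triangles joined through a single person and the `betweenness` function name are assumptions made for illustration, not the code used in the paper.

```r
# Betweenness centrality for an unweighted, undirected network given as an
# adjacency matrix: for each person, the number of shortest paths between other
# pairs that pass through them (paths are split evenly when there are ties).
betweenness <- function(adj) {
  n <- nrow(adj)
  cb <- numeric(n)
  for (s in seq_len(n)) {
    dist <- rep(-1, n); dist[s] <- 0       # -1 marks "not yet reached"
    sigma <- numeric(n); sigma[s] <- 1     # number of shortest paths from s
    preds <- vector("list", n)             # predecessors on shortest paths
    visited <- integer(0)
    queue <- s
    while (length(queue) > 0) {            # breadth-first search from s
      v <- queue[1]; queue <- queue[-1]
      visited <- c(visited, v)
      for (w in which(adj[v, ] > 0)) {
        if (dist[w] < 0) { dist[w] <- dist[v] + 1; queue <- c(queue, w) }
        if (dist[w] == dist[v] + 1) {
          sigma[w] <- sigma[w] + sigma[v]
          preds[[w]] <- c(preds[[w]], v)
        }
      }
    }
    delta <- numeric(n)
    for (w in rev(visited)) {              # accumulate, farthest nodes first
      for (v in preds[[w]]) {
        delta[v] <- delta[v] + sigma[v] / sigma[w] * (1 + delta[w])
      }
      if (w != s) cb[w] <- cb[w] + delta[w]
    }
  }
  cb / 2  # each undirected pair was counted once from each endpoint
}

# Hypothetical toy network: triangles {1,2,3} and {5,6,7} joined only via person 4.
edges <- rbind(c(1,2), c(1,3), c(2,3), c(3,4), c(4,5), c(5,6), c(5,7), c(6,7))
adj <- matrix(0, 7, 7)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

betweenness(adj)  # 0 0 8 9 8 0 0: the bridging person (4) scores highest
```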

With the information that betweenness predicts the extent of epidemic spread in populations with community structure, we sought a means to use such information to design intelligent control measures. How do you find people who have high betweenness? As abstract as the concept of betweenness may seem, it turns out to not be that difficult. We start with an infected person and do standard contact tracing. That is, we ask the index case about his or her contacts. Contact tracing is one of the most important tools in the toolkit of the gumshoe epidemiologist. From the index case’s contacts, we pick a random individual and trace his or her contacts. Picking a random individual from this second generation of contact traces, we simply ask "do you know the index case?" If so, we keep going: trace the contacts of a random contact, ask again if this person knows the index case. When we come to an individual who does not know the index case, we have found our bridge. It is the penultimate person in the chain – the person who links the index case to someone he or she doesn’t know. Basically, we do a “random walk” on the social network looking for people who link otherwise unconnected individuals. When we find the bridge, we vaccinate all of his/her contacts. We call our vaccination algorithm the “Community Bridge Finder” (CBF).
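The random walk described above can be sketched in a few lines of base R. This follows the simplified verbal description in this post rather than the exact algorithm in the paper. The toy network, the function and argument names, the `max_steps` cap, and the choice never to step back onto the index case are all assumptions made for illustration.

```r
# Hypothetical toy network: triangles {1,2,3} and {5,6,7} joined only via person 4.
edges <- rbind(c(1,2), c(1,3), c(2,3), c(3,4), c(4,5), c(5,6), c(5,7), c(6,7))
adj <- matrix(0, 7, 7)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

community_bridge_finder <- function(adj, index_case, max_steps = 100) {
  contacts <- function(v) which(adj[v, ] > 0)
  pick <- function(x) x[sample.int(length(x), 1)]  # safe for length-1 vectors
  if (length(contacts(index_case)) == 0) return(NULL)
  previous <- pick(contacts(index_case))     # a random contact of the index case
  for (step in seq_len(max_steps)) {
    # trace the contacts of the current person (assumption: skip the index case)
    candidates <- setdiff(contacts(previous), index_case)
    if (length(candidates) == 0) return(NULL)  # dead end: nobody left to ask
    current <- pick(candidates)
    if (adj[current, index_case] == 0) {
      # `current` does not know the index case, so the penultimate person
      # in the chain (`previous`) is the bridge; vaccinate their contacts
      return(list(bridge = previous, vaccinate = contacts(previous)))
    }
    previous <- current                      # knows the index case: keep walking
  }
  NULL  # no bridge found within max_steps
}

set.seed(1)
result <- community_bridge_finder(adj, index_case = 1)
result$bridge     # person 3, who links the index case to person 4
result$vaccinate  # persons 1, 2 and 4
```

Note that the walk only ever needs local information: each step asks one person for their contacts and whether they know the index case, which is the kind of information contact tracing already collects.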

When we vaccinate according to this algorithm, we reduce the final size of the epidemic far more than randomly vaccinating the same fraction of people. More interestingly, CBF also does better than the other vaccination algorithm that uses only local network information typically available to epidemiological investigators. This algorithm, known as the “Acquaintance Method,” vaccinates a randomly selected contact of an index case. The idea behind the acquaintance method is that the contacts of a case are more likely than chance to be highly connected individuals themselves in a population with heterogeneous contacts. That is, given that you have a contact, you’re on average more likely to be connected to a hub than to someone with few connections because hubs simply have more connections.
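The logic of the acquaintance method can be checked with a short calculation in base R: compare the average number of contacts of a random person with the average number of contacts of a random contact of a random person. The star and couples networks of 10 people and the `mean_contact_degree` helper are assumptions made for illustration.

```r
n <- 10  # hypothetical population size

# Star: person 1 is in contact with everyone else.
star <- matrix(0, n, n)
star[1, -1] <- 1
star[-1, 1] <- 1

# Couples: everyone has exactly one contact.
couples <- matrix(0, n, n)
for (i in seq(1, n, by = 2)) {
  couples[i, i + 1] <- 1
  couples[i + 1, i] <- 1
}

# Expected number of contacts of a randomly chosen contact of a randomly
# chosen person (assumes nobody is isolated, so no division by zero).
mean_contact_degree <- function(adj) {
  deg <- rowSums(adj)
  per_person <- as.vector(adj %*% deg) / deg  # mean degree of each person's contacts
  mean(per_person)
}

mean(rowSums(star))           # 1.8 contacts for a random person
mean_contact_degree(star)     # 8.2 contacts for a random contact
mean(rowSums(couples))        # 1
mean_contact_degree(couples)  # 1: no gain when everyone has the same number
```

With heterogeneous contacts, a random contact is far better connected than a random person, which is what the acquaintance method exploits. When everyone has the same number of contacts, the two averages coincide and the method has nothing to work with.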

Of course, the way that we constructed our contact networks, we stacked the deck against the acquaintance method. Remember, everyone has the same number of contacts; what varies is how many contacts are within versus between communities. One of the great limiting factors for progress in social network analysis – and network epidemiology in particular – is the paucity of detailed network data from well-defined human populations. A domain that has garnered a lot of interest recently is the analysis of networks created by social media such as Facebook and Twitter. We used data from Facebook when its use was still restricted to particular college campuses to provide networks on which infections could pass. Facebook users typically have many contacts, probably way more than people have in epidemiologically relevant networks. However, because the data come from college acquaintance networks, we were able to prune the networks down toward something hopefully more epidemiologically appropriate. We kept contacts in the networks only if two individuals shared one of several key attributes such as shared dorm or major. What this yielded was a series of networks with heterogeneous contact structure and quite a bit of community structure (the measure of community structure hovered near the values where epidemics transitioned from small to large in our simulated networks). Once again, CBF outperformed the acquaintance method. This provided very strong evidence that community structure really matters for epidemic behavior and that exploiting information on community structure allows us to better control outbreaks of infectious disease.

Plotting Networks in R

Using the network package, you can plot graphs in a flexible and powerful way.  Often, when plotting a network, we want to vary the color, size, or shape of the vertices based on some attributes.  Let's say that we have a freewheeling sexual network (easier to simulate) and we would like to color the vertices of the graph according to their HIV sero-status.  Let's also say that we want to make the shape of each vertex reflect the sex of the individual.  We use the following code:

[r]
library(network)

# begin with randomly constructed edgelist
set.seed(12345)

n1 <- round(1+10*runif(50))
n2 <- round(1+10*runif(50))
eee <- cbind(n1,n2)[order(n1),]
net <- network(eee,directed=FALSE) # this will be a dense network!
hiv <- rbinom(50,size=1,prob=0.2) # random infections!
sex <- rbinom(50,size=1,prob=0.5) # random sexes!
set.vertex.attribute(net,"hiv",hiv+1)
set.vertex.attribute(net,"sex",sex+3)

## now plot
plot(net, vertex.col="hiv", vertex.sides="sex", vertex.cex=5, vertex.rot=-30, edge.lwd=1)
[/r]

I definitely wouldn't want to be part of that party.